WSL2-Linux-Kernel

Граф коммитов

Автор	SHA1	Сообщение	Дата
Andreas Gruenbacher	6c7410f449	gfs2: gfs2_freeze_lock_shared cleanup All the remaining users of gfs2_freeze_lock_shared() set freeze_gh to &sdp->sd_freeze_gh and flags to 0, so remove those two parameters. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2023-07-03 22:30:26 +02:00
Andreas Gruenbacher	5432af15f8	gfs2: Replace sd_freeze_state with SDF_FROZEN flag Replace sd_freeze_state with a new SDF_FROZEN flag. There no longer is a need for indicating that a freeze is in progress (SDF_STARTING_FREEZE); we are now protecting the critical sections with the sd_freeze_mutex. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2023-07-03 22:30:23 +02:00
Andreas Gruenbacher	b77b4a4815	gfs2: Rework freeze / thaw logic So far, at mount time, gfs2 would take the freeze glock in shared mode and then immediately drop it again, turning it into a cached glock that can be reclaimed at any time. To freeze the filesystem cluster-wide, the node initiating the freeze would take the freeze glock in exclusive mode, which would cause the freeze glock's freeze_go_sync() callback to run on each node. There, gfs2 would freeze the filesystem and schedule gfs2_freeze_func() to run. gfs2_freeze_func() would re-acquire the freeze glock in shared mode, thaw the filesystem, and drop the freeze glock again. The initiating node would keep the freeze glock held in exclusive mode. To thaw the filesystem, the initiating node would drop the freeze glock again, which would allow gfs2_freeze_func() to resume on all nodes, leaving the filesystem in the thawed state. It turns out that in freeze_go_sync(), we cannot reliably and safely freeze the filesystem. This is primarily because the final unmount of a filesystem takes a write lock on the s_umount rw semaphore before calling into gfs2_put_super(), and freeze_go_sync() needs to call freeze_super() which also takes a write lock on the same semaphore, causing a deadlock. We could work around this by trying to take an active reference on the super block first, which would prevent unmount from running at the same time. But that can fail, and freeze_go_sync() isn't actually allowed to fail. To get around this, this patch changes the freeze glock locking scheme as follows: At mount time, each node takes the freeze glock in shared mode. To freeze a filesystem, the initiating node first freezes the filesystem locally and then drops and re-acquires the freeze glock in exclusive mode. All other nodes notice that there is contention on the freeze glock in their go_callback callbacks, and they schedule gfs2_freeze_func() to run. There, they freeze the filesystem locally and drop and re-acquire the freeze glock before re-thawing the filesystem. This is happening outside of the glock state engine, so there, we are allowed to fail. From a cluster point of view, taking and immediately dropping a glock is indistinguishable from taking the glock and only dropping it upon contention, so this new scheme is compatible with the old one. Thanks to Li Dong <lidong@vivo.com> for reporting a locking bug in gfs2_freeze_func() in a previous version of this commit. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2023-07-03 22:25:02 +02:00
Andreas Gruenbacher	cad1e15804	gfs2: Rename SDF_{FS_FROZEN => FREEZE_INITIATOR} Rename the SDF_FS_FROZEN flag to SDF_FREEZE_INITIATOR to indicate more clearly that the node that has this flag set is the initiator of the freeze. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com	2023-06-15 09:57:38 +02:00
Andreas Gruenbacher	e392edd5d5	gfs2: Rename gfs2_freeze_lock{ => _shared } Rename gfs2_freeze_lock to gfs2_freeze_lock_shared to make it a bit more obvious that this function establishes the "thawed" state of the freeze glock. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2023-06-15 09:57:38 +02:00
Andreas Gruenbacher	097cca525a	gfs2: Rename the {freeze,thaw}_super callbacks Rename gfs2_freeze to gfs2_freeze_super and gfs2_unfreeze to gfs2_thaw_super to match the names of the corresponding super operations. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2023-06-15 09:57:38 +02:00
Andreas Gruenbacher	af1abe1146	gfs2: Rename remaining "transaction" glock references The transaction glock was repurposed to serve as the new freeze glock years ago. Don't refer to it as the transaction glock anymore. Also, to be more precise, call it the "freeze glock" instead of the "freeze lock". Ditto for the journal glock. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2023-06-15 09:57:38 +02:00
Tuo Li	6fa0a72cbb	gfs2: Fix possible data races in gfs2_show_options() Some fields such as gt_logd_secs of the struct gfs2_tune are accessed without holding the lock gt_spin in gfs2_show_options(): val = sdp->sd_tune.gt_logd_secs; if (val != 30) seq_printf(s, ",commit=%d", val); And thus can cause data races when gfs2_show_options() and other functions such as gfs2_reconfigure() are concurrently executed: spin_lock(&gt->gt_spin); gt->gt_logd_secs = newargs->ar_commit; To fix these possible data races, the lock sdp->sd_tune.gt_spin is acquired before accessing the fields of gfs2_tune and released after these accesses. Further changes by Andreas: - Don't hold the spin lock over the seq_printf operations. Reported-by: BassCheck <bass@buaa.edu.cn> Signed-off-by: Tuo Li <islituo@gmail.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2023-06-13 14:34:17 +02:00
Bob Peterson	f9da18cd46	gfs2: Don't remember delete unless it's successful This patch changes function evict_unlinked_inode so it does not call gfs2_inode_remember_delete until it gets a good return code from gfs2_dinode_dealloc. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2023-06-06 18:35:06 +02:00
Bob Peterson	9b620429ec	gfs2: ignore rindex_update failure in dinode_dealloc Before this patch, function gfs2_dinode_dealloc would abort if it got a bad return code from gfs2_rindex_update(). The problem is that it left the dinode in the unlinked (not free) state, which meant subsequent fsck would clean it up and flag an error. That meant some of our QE tests would fail. The sole purpose of gfs2_rindex_update(), in this code path, is to read in any newer rgrps added by gfs2_grow. But since this is a delete operation it won't actually use any of those new rgrps. It can really only twiddle the bits from "Unlinked" to "Free" in an existing rgrp. Therefore the error should not prevent the transition from unlinked to free. This patch makes gfs2_dinode_dealloc ignore the bad return code and proceed with freeing the dinode so the QE tests will not be tripped up. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2023-06-06 18:35:06 +02:00
Bob Peterson	504a10d9e4	gfs2: Don't deref jdesc in evict On corrupt gfs2 file systems the evict code can try to reference the journal descriptor structure, jdesc, after it has been freed and set to NULL. The sequence of events is: init_journal() ... fail_jindex: gfs2_jindex_free(sdp); <------frees journals, sets jdesc = NULL if (gfs2_holder_initialized(&ji_gh)) gfs2_glock_dq_uninit(&ji_gh); fail: iput(sdp->sd_jindex); <--references jdesc in evict_linked_inode evict() gfs2_evict_inode() evict_linked_inode() ret = gfs2_trans_begin(sdp, 0, sdp->sd_jdesc->jd_blocks); <------references the now freed/zeroed sd_jdesc pointer. The call to gfs2_trans_begin is done because the truncate_inode_pages call can cause gfs2 events that require a transaction, such as removing journaled data (jdata) blocks from the journal. This patch fixes the problem by adding a check for sdp->sd_jdesc to function gfs2_evict_inode. In theory, this should only happen to corrupt gfs2 file systems, when gfs2 detects the problem, reports it, then tries to evict all the system inodes it has read in up to that point. Reported-by: Yang Lan <lanyang0908@gmail.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2023-05-10 17:15:18 +02:00
Bob Peterson	68ca088dc1	gfs2: Perform second log flush in gfs2_make_fs_ro Before this patch, function gfs2_make_fs_ro called gfs2_log_flush once to finalize the log. However, if there's dirty metadata, log flushes tend to sync the metadata and formulate revokes. Before this patch, those revokes may not be written out to the journal immediately, which meant unresolved glocks could still have revokes in their ail lists. When the glock worker runs, it tries to transition the glock, but the unresolved revokes in the ail still need to be written, so it tries to start a transaction. It's impossible to start a transaction because at that point, the SDF_JOURNAL_LIVE flag has been cleared by gfs2_make_fs_ro. That causes the glock worker to fail, unable to write the revokes. The calling sequence looked something like this: gfs2_make_fs_ro gfs2_log_flush - with GFS2_LOG_HEAD_FLUSH_SHUTDOWN flag set if (flags & GFS2_LOG_HEAD_FLUSH_SHUTDOWN) clear_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags); ...meanwhile... glock_work_func do_xmote rgrp_go_sync (or possibly inode_go_sync) ... gfs2_ail_empty_gl __gfs2_trans_begin if (unlikely(!test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags))) { ... return -EROFS; The previous patch in the series ("gfs2: return errors from gfs2_ail_empty_gl") now causes the transaction error to no longer be ignored, so it causes a warning from MOST of the xfstests: WARNING: CPU: 11 PID: X at fs/gfs2/super.c:603 gfs2_put_super [gfs2] which corresponds to: WARN_ON(gfs2_withdrawing(sdp)); The withdraw was triggered silently from do_xmote by: if (unlikely(sdp->sd_log_error && !gfs2_withdrawn(sdp))) gfs2_withdraw_delayed(sdp); This patch adds a second log_flush to gfs2_make_fs_ro: one to sync the data and one to sync any outstanding revokes and finalize the journal. Note that both of these log flushes need to be "special," in other words, not GFS2_LOG_HEAD_FLUSH_NORMAL. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2023-04-25 11:01:41 +02:00
Andreas Gruenbacher	b66f723bb5	gfs2: Improve gfs2_make_fs_rw error handling In gfs2_make_fs_rw(), make sure to call gfs2_consist() to report an inconsistency and mark the filesystem as withdrawn when gfs2_find_jhead() fails. At the end of gfs2_make_fs_rw(), when we discover that the filesystem has been withdrawn, make sure we report an error. This also replaces the gfs2_withdrawn() check after gfs2_find_jhead(). Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: syzbot+f51cb4b9afbd87ec06f2@syzkaller.appspotmail.com Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2023-01-31 22:40:24 +01:00
Andreas Gruenbacher	b88beb9a24	gfs2: Evict inodes cooperatively Add a gfs2_evict_inodes() helper that evicts inodes cooperatively across the cluster. This avoids running into timeouts during unmount unnecessarily. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2023-01-31 22:40:24 +01:00
Andreas Gruenbacher	6b388abc33	gfs2: Flush delete work before shrinking inode cache In gfs2_kill_sb(), flush the delete work queue after setting the SDF_DEACTIVATING flag. This ensures that no new inodes will be instantiated anymore, and the inode cache will be empty after the following kill_block_super() -> generic_shutdown_super() -> evict_inodes() call. With that, function gfs2_make_fs_ro() now calls gfs2_flush_delete_work() after the workqueue has been destroyed. Skip that by checking for the presence of the SDF_DEACTIVATING flag. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2023-01-31 22:40:24 +01:00
Andreas Gruenbacher	f0e56edc2e	gfs2: Split the two kinds of glock "delete" work Function delete_work_func() is used for two purposes: * to immediately try to evict the glock's inode, and * to verify after a little while that the inode has been deleted as expected, and didn't just get skipped. These two operations are not separated very well, so introduce two new glock flags to improved that. Split gfs2_queue_delete_work() into gfs2_queue_try_to_evict and gfs2_queue_verify_evict(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2023-01-31 22:40:24 +01:00
Andreas Gruenbacher	0247f4e959	gfs2: Move delete workqueue into super block Move the global delete workqueue into struct gfs2_sbd so that we can flush / drain it without interfering with other filesystems. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2023-01-31 22:40:24 +01:00
Andreas Gruenbacher	2d1439557f	gfs2: Improve gfs2_upgrade_iopen_glock comment Improve the comment describing the inode and iopen glock interactions and the glock poking related to inode evict. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2023-01-31 22:40:18 +01:00
Andreas Gruenbacher	9ffa18884c	gfs2: gl_object races fix Function glock_clear_object() checks if the specified glock is still pointing at the right object and clears the gl_object pointer. To handle the case of incompletely constructed inodes, glock_clear_object() also allows gl_object to be NULL. However, in the teardown case, when iget_failed() is called and the inode is removed from the inode hash, by the time we get to the glock_clear_object() calls in gfs2_put_super() and its helpers, we don't have exclusion against concurrent gfs2_inode_lookup() and gfs2_create_inode() calls, and the inode and iopen glocks may already be pointing at another inode, so the checks in glock_clear_object() are incorrect. To better handle this case, always completely disassociate an inode from its glocks before tearing it down. In addition, get rid of a duplicate glock_clear_object() call in gfs2_evict_inode(). That way, glock_clear_object() will only ever be called when the glock points at the current inode, and the NULL check in glock_clear_object() can be removed. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2023-01-27 15:55:48 +01:00
Andreas Gruenbacher	fe1bff6517	gfs2: Simply dequeue iopen glock in gfs2_evict_inode With the previous change, to simplify things, we can always just dequeue and uninitialize the iopen glock in gfs2_evict_inode() even if it isn't queued anymore. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2022-12-06 16:06:32 +01:00
Andreas Gruenbacher	764665c677	gfs2: Clean up after gfs2_create_inode rework Since commit `3d36e57ff7` ("gfs2: gfs2_create_inode rework"), gfs2_evict_inode() and gfs2_create_inode() / gfs2_inode_lookup() will synchronize via the inode hash table and we can be certain that once a new inode is inserted into the inode hash table(), gfs2_evict_inode() has completely destroyed any previous versions. We no longer need to worry about overlapping inode object lifespans. Update the code and comments accordingly. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2022-12-06 16:06:31 +01:00
Andreas Gruenbacher	7db354444a	gfs2: Cosmetic gfs2_dinode_{in,out} cleanup In each of the two functions, add an inode variable that points to &ip->i_inode and use that throughout the rest of the function. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2022-12-06 16:06:31 +01:00
Andreas Gruenbacher	38552ff676	gfs2: Fix and clean up create / evict interaction When gfs2_create_inode() fails after creating a new inode, it uses the GIF_FREE_VFS_INODE and GIF_ALLOC_FAILED inode flags to communicate to gfs2_evict_inode() which parts of the inode need to be deallocated and destroyed. In some error cases, the inode ends up being allocated on disk and then accidentally left behind. In others, the inode is partially constructed and then not properly destroyed. Clean this up by completely handling the inode deallocation and destruction in gfs2_evict_inode(). This means that gfs2_evict_inode() may now be faced with partially constructed inodes, so add the necessary checks to cope with that. In particular, make sure that for incompletely constructed inodes, we're not accessing the buffers backing the on-disk blocks; the contents may be undefined. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2022-12-02 15:58:00 +01:00
Andreas Gruenbacher	c7d7d2d345	gfs2: Merge branch 'for-next.nopid' into for-next Resolves a conflict in gfs2_inode_lookup() between the following commits: gfs2: Use TRY lock in gfs2_inode_lookup for UNLINKED inodes gfs2: Mark the remaining process-independent glock holders as GL_NOPID Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2022-10-09 22:56:28 +02:00
Andreas Gruenbacher	53d6913295	gfs2: Instantiate glocks ouside of glock state engine Instantiate glocks outside of the glock state engine: there is no real reason for instantiating them inside the glock state engine; it only complicates the code. Instead, instantiate them in gfs2_glock_wait() and gfs2_glock_async_wait() using the new gfs2_glock_holder_ready() helper. On top of that, the only other place that acquires a glock without using gfs2_glock_wait() or gfs2_glock_async_wait() is gfs2_upgrade_iopen_glock(), so call gfs2_glock_holder_ready() there as well. If a dinode has a pending truncate, the glock-specific instantiate function for inodes wakes up the truncate function in the quota daemon. Waiting for the completion of the truncate was previously done by the glock state engine, but we now need to wait in inode_go_instantiate(). This also means that gfs2_instantiate() will now no longer return any "special" error codes. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2022-06-29 16:53:22 +02:00
Andreas Gruenbacher	ebdc416c9c	gfs2: Mark the remaining process-independent glock holders as GL_NOPID Add the GL_NOPID flag for the remaining glock holders which are not associated with the current process. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2022-06-29 13:07:54 +02:00
Linus Torvalds	3d198e42ce	gfs2 fixes * To avoid deadlocks, actively cancel dlm locking requests when we give up on them. Further dlm operations on the same lock will return -EBUSY until the cancel has been completed, so in that case, wait and repeat. (This is rare.) * Lock inversion fixes in gfs2_inode_lookup() and gfs2_create_inode(). * Some more fallout from the gfs2 mmap + page fault deadlock fixes (merge `c03098d4b9`). * Various other minor bug fixes and cleanups. -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEEJZs3krPW0xkhLMTc1b+f6wMTZToFAmJGCAsUHGFncnVlbmJh QHJlZGhhdC5jb20ACgkQ1b+f6wMTZTrWcg//TEDazop2y7rGMFsMBXI7HPyBu4uD BwoclS5IfjoQbBTtkl7cWmQViMk8s3EFGxdEBorfGmMEq65I/krHi4JXG2GETdui ORoi8NH1sW9H2GJXmwtE2wYZlJBZtdntoBGdPXWFvt1hLajf6WGpy/CR1Wd4rYak 8AHQxtd98OtsA6LAPlWl2UaXS4m7rhEt0Iy83mqWtbBOvZsULczuraazawnoQ/m4 Wf5pvb+73hpwTVUkruH0+If+vi/HF0WVv1nZVyMwrSh3mpvkrsZSkbN0fd0veAhD b5XGI1dD5+YPxAOdwDKqnqy8/E3gRekybmpcd48BXoxF4EX/AlLX/Zn9qnrAhY6M qEbGzC2UqLIrPe/KjzQ8+0aKPCY5FB1VqoRMAHC/bj7mlmNgGtHxQUXdDmC4LIi6 GOLpnueI1KtA7Hb4HCgX0BLxSqUEhUuGssBkNIqGet1cRwmM33pt1J4CG4TDLBt/ VZiERnN3qktSlmukvd3oLSZso4fVbg7PyFTl8YMgiLDNfgcZI9RY5qwIJYrOaucr KTNfR6lAL2slFPIVcLwmgJt+axogk6GnCkfDVMX2VLJnMQYqJnDYn6fVG9jngSB+ F4UBZ/alzhpel08r8xtxjADFJzA+weG1I2jnikSLKlgVN+uiQTBrhqyWdtxtEqFM 31Nd7piiSVQEvrM= =xMYz -----END PGP SIGNATURE----- Merge tag 'gfs2-v5.17-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 fixes from Andreas Gruenbacher: - To avoid deadlocks, actively cancel dlm locking requests when we give up on them. Further dlm operations on the same lock will return -EBUSY until the cancel has been completed, so in that case, wait and repeat. (This is rare.) - Lock inversion fixes in gfs2_inode_lookup() and gfs2_create_inode(). - Some more fallout from the gfs2 mmap + page fault deadlock fixes (merged in commit c03098d4b9ad7: "Merge tag 'gfs2-v5.15-rc5-mmap-fault'"). - Various other minor bug fixes and cleanups. * tag 'gfs2-v5.17-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Make sure FITRIM minlen is rounded up to fs block size gfs2: Make sure not to return short direct writes gfs2: Remove dead code in gfs2_file_read_iter gfs2: Fix gfs2_file_buffered_write endless loop workaround gfs2: Minor retry logic cleanup gfs2: Disable page faults during lockless buffered reads gfs2: Fix should_fault_in_pages() logic gfs2: Remove return value for gfs2_indirect_init gfs2: Initialize gh_error in gfs2_glock_nq gfs2: Make use of list_is_first gfs2: Switch lock order of inode and iopen glock gfs2: cancel timed-out glock requests gfs2: Expect -EBUSY after canceling dlm locking requests gfs2: gfs2_setattr_size error path fix gfs2: assign rgrp glock before compute_bitstructs	2022-03-31 15:57:50 -07:00
Muchun Song	fd60b28842	fs: allocate inode by using alloc_inode_sb() The inode allocation is supposed to use alloc_inode_sb(), so convert kmem_cache_alloc() of all filesystems to alloc_inode_sb(). Link: https://lkml.kernel.org/r/20220228122126.37293-5-songmuchun@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Theodore Ts'o <tytso@mit.edu> [ext4] Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Cc: Alex Shi <alexs@kernel.org> Cc: Anna Schumaker <Anna.Schumaker@Netapp.com> Cc: Chao Yu <chao@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Fam Zheng <fam.zheng@bytedance.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kari Argillander <kari.argillander@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Xiongchun Duan <duanxiongchun@bytedance.com> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-03-22 15:57:03 -07:00
Andreas Gruenbacher	7336905a89	gfs2: gfs2_setattr_size error path fix When gfs2_setattr_size() fails, it calls gfs2_rs_delete(ip, NULL) to get rid of any reservations the inode may have. Instead, it should pass in the inode's write count as the second parameter to allow gfs2_rs_delete() to figure out if the inode has any writers left. In a next step, there are two instances of gfs2_rs_delete(ip, NULL) left where we know that there can be no other users of the inode. Replace those with gfs2_rs_deltree(&ip->i_res) to avoid the unnecessary write count check. With that, gfs2_rs_delete() is only called with the inode's actual write count, so get rid of the second parameter. Fixes: `a097dc7e24` ("GFS2: Make rgrp reservations part of the gfs2_inode structure") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2022-02-15 15:01:40 +01:00
Andreas Gruenbacher	8d567162ef	gfs2: Remove redundant check for GLF_INSTANTIATE_NEEDED If the GLF_INSTANTIATE_NEEDED flag isn't set, gfs2_instantiate() is a no-op. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2021-12-04 20:02:26 +01:00
Bob Peterson	49462e2be1	gfs2: release iopen glock early in evict Before this patch, evict would clear the iopen glock's gl_object after releasing the inode glock. In the meantime, another process could reuse the same block and thus glocks for a new inode. It would lock the inode glock (exclusively), and then the iopen glock (shared). The shared locking mode doesn't provide any ordering against the evict, so by the time the iopen glock is reused, evict may not have gotten to setting gl_object to NULL. Fix that by releasing the iopen glock before the inode glock in gfs2_evict_inode. Signed-off-by: Bob Peterson <rpeterso@redhat.com>gl_object Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2021-11-06 10:25:31 +01:00
Bob Peterson	ec1d398dd7	gfs2: Eliminate GIF_INVALID flag With the addition of the new GLF_INSTANTIATE_NEEDED flag, the GIF_INVALID flag is now redundant. This patch removes it. Since inode_instantiate is only called when instantiation is needed, the check in inode_instantiate is removed too. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2021-10-25 08:42:19 +02:00
Bob Peterson	f2e70d8f2f	gfs2: fix GL_SKIP node_scope problems Before this patch, when a glock was locked, the very first holder on the queue would unlock the lockref and call the go_instantiate glops function (if one existed), unless GL_SKIP was specified. When we introduced the new node-scope concept, we allowed multiple holders to lock glocks in EX mode and share the lock. But node-scope introduced a new problem: if the first holder has GL_SKIP and the next one does NOT, since it is not the first holder on the queue, the go_instantiate op was not called. Eventually the GL_SKIP holder may call the instantiate sub-function (e.g. gfs2_rgrp_bh_get) but there was still a window of time in which another non-GL_SKIP holder assumes the instantiate function had been called by the first holder. In the case of rgrp glocks, this led to a NULL pointer dereference on the buffer_heads. This patch tries to fix the problem by introducing two new glock flags: GLF_INSTANTIATE_NEEDED, which keeps track of when the instantiate function needs to be called to "fill in" or "read in" the object before it is referenced. GLF_INSTANTIATE_IN_PROG which is used to determine when a process is in the process of reading in the object. Whenever a function needs to reference the object, it checks the GLF_INSTANTIATE_NEEDED flag, and if set, it sets GLF_INSTANTIATE_IN_PROG and calls the glops "go_instantiate" function. As before, the gl_lockref spin_lock is unlocked during the IO operation, which may take a relatively long amount of time to complete. While unlocked, if another process determines go_instantiate is still needed, it sees GLF_INSTANTIATE_IN_PROG is set, and waits for the go_instantiate glop operation to be completed. Once GLF_INSTANTIATE_IN_PROG is cleared, it needs to check GLF_INSTANTIATE_NEEDED again because the other process's go_instantiate operation may not have been successful. Functions that previously called the instantiate sub-functions now call directly into gfs2_instantiate so the new bits are managed properly. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2021-10-25 08:42:19 +02:00
Bob Peterson	ba3ca2bcf4	gfs2: nit: gfs2_drop_inode shouldn't return bool Today, gfs2_drop_inode can return "false" for an int value. I'm sure this was just an oversight. Change to int value. Signed-off-by: Bob Peterson <rpeterso@redhat.com>	2021-08-20 09:03:46 -05:00
Bob Peterson	70c11ba8f2	gfs2: Don't release and reacquire local statfs bh Before this patch, several functions in gfs2 related to the updating of the statfs file used a newly acquired/read buffer_head for the local statfs file. This is completely unnecessary, because other nodes should never update it. Recreating the buffer is a waste of time. This patch allows gfs2 to read in the local statefs buffer_head at mount time and keep it around until unmount time. Signed-off-by: Bob Peterson <rpeterso@redhat.com>	2021-08-20 09:03:46 -05:00
Bob Peterson	a28dc123fa	gfs2: init system threads before freeze lock Patch `96b1454f2e` ("gfs2: move freeze glock outside the make_fs_rw and _ro functions") changed the gfs2 mount sequence so that it holds the freeze lock before calling gfs2_make_fs_rw. Before this patch, gfs2_make_fs_rw called init_threads to initialize the quotad and logd threads. That is a problem if the system needs to withdraw due to IO errors early in the mount sequence, for example, while initializing the system statfs inode: 1. An IO error causes the statfs glock to not sync properly after recovery, and leaves items on the ail list. 2. The leftover items on the ail list causes its do_xmote call to fail, which makes it want to withdraw. But since the glock code cannot withdraw (because the withdraw sequence uses glocks) it relies upon the logd daemon to initiate the withdraw. 3. The withdraw can never be performed by the logd daemon because all this takes place before the logd daemon is started. This patch moves function init_threads from super.c to ops_fstype.c and it changes gfs2_fill_super to start its threads before holding the freeze lock, and if there's an error, stop its threads after releasing it. This allows the logd to run unblocked by the freeze lock. Thus, the logd daemon can perform its withdraw sequence properly. Fixes: `96b1454f2e` ("gfs2: move freeze glock outside the make_fs_rw and _ro functions") Signed-off-by: Bob Peterson <rpeterso@redhat.com>	2021-08-20 09:01:02 -05:00
Lee Jones	c551f66c5d	gfs2: Fix a number of kernel-doc warnings Building the kernel with W=1 results in a number of kernel-doc warnings like incorrect function names and parameter descriptions. Fix those, mostly by adding missing parameter descriptions, removing left-over descriptions, and demoting some less important kernel-doc comments into regular comments. Originally proposed by Lee Jones; improved and combined into a single patch by Andreas. Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2021-04-09 22:14:13 +02:00
Bob Peterson	ff132c5f93	gfs2: report "already frozen/thawed" errors Before this patch, gfs2's freeze function failed to report an error when the target file system was already frozen as it should (and as generic vfs function freeze_super does. Similarly, gfs2's thaw function failed to report an error when trying to thaw a file system that is not frozen, as vfs function thaw_super does. The errors were checked, but it always returned a 0 return code. This patch adds the missing error return codes to gfs2 freeze and thaw. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2021-03-25 18:53:38 +01:00
Andrew Price	62dd0f98a0	gfs2: Flag a withdraw if init_threads() fails Interrupting mount with ^C quickly enough can cause the kthread_run() calls in gfs2's init_threads() to fail and the error path leads to a deadlock on the s_umount rwsem. The abridged chain of events is: [mount path] get_tree_bdev() sget_fc() alloc_super() down_write_nested(&s->s_umount, SINGLE_DEPTH_NESTING); [acquired] gfs2_fill_super() gfs2_make_fs_rw() init_threads() kthread_run() ( Interrupted ) [Error path] gfs2_gl_hash_clear() flush_workqueue(glock_workqueue) wait_for_completion() [workqueue context] glock_work_func() run_queue() do_xmote() freeze_go_sync() freeze_super() down_write(&sb->s_umount) [deadlock] In freeze_go_sync() there is a gfs2_withdrawn() check that we can use to make sure freeze_super() is not called in the error path, so add a gfs2_withdraw_delayed() call when init_threads() fails. Ref: https://bugzilla.kernel.org/show_bug.cgi?id=212231 Reported-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: Andrew Price <anprice@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2021-03-15 15:32:42 +01:00
Yang Li	eb602521f4	gfs2: make function gfs2_make_fs_ro() to void type It fixes the following warning detected by coccinelle: ./fs/gfs2/super.c:592:5-10: Unneeded variable: "error". Return "0" on line 628 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2021-03-07 17:04:55 +01:00
Linus Torvalds	f6e1e1d1e1	Changes in gfs2: * Log space and revoke accounting rework to fix some failed asserts. * Local resource group glock sharing for better local performance. * Add support for version 1802 filesystems: trusted xattr support and '-o rgrplvb' mounts by default. * Actually synchronize on the inode glock's FREEING bit during withdraw ("gfs2: fix glock confusion in function signal_our_withdraw"). * Fix parallel recovery of multiple journals ("gfs2: keep bios separate for each journal"). * Various other bug fixes. -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEEJZs3krPW0xkhLMTc1b+f6wMTZToFAmA1TmwUHGFncnVlbmJh QHJlZGhhdC5jb20ACgkQ1b+f6wMTZTpDZhAArnFj5AhWMI2+DD5o05EILdgDSpwh JWYT1pfRqR1OZrs7ZZ7tGZB4H6oytYfJ+4mg9Kk7CE7oJKcBh695IPZoIWv8+BCC WIgQGJytCFp4tuDNw11HZ0ahgW4zXPyJTt6jidZ5jVkux31JrUS7fVqSsD2vIPqA iQMcJIH+NLTlYbNt4d5T/ngaoRcx7m18RWkcxf6Y+/DBnnwIe4ZDpZmkWVykuncv OFSvXK8vKyLWGnvH/MIsywfYeU5rj/0AIu66JhVILQ4v5kGYIigwY3quXP2SoITM Z0+N5Gj/N4OWSscRS86zyqhnRucrjDkNP2+oGSzJWgtSXE/KplyfInAmQWzhIPRM n7T0boTp+gOTzGq7ELCzj44KICLG76WgDwaR2bLHuQ2/ppVrHNltZqncP2iwynN6 glfST/eHBUBu1qTYLaOAfkUBlhpKDXu0YPcXX7lH6M0JqyvkRUFfuBAU9dic9D9K zsxplHGJrZnE9QFWWbS3aOviPlSHaXfkZF0Xv7QCLyuPRhu+e/qfcAoeVhxSd4+e I0grs/TxM61jyju9SmqnM7P+8qYS55naYH1V+6iNCU5dax8MvdxNZuneBQIa07U+ Y84JPQvTBZDUE0gZ8fUzZtnYS7RqyiG7BL+T4W5Ph7LgxXbgQD7CWerYpg7fBm/j HEpjKqrS96zfTyk= =45VG -----END PGP SIGNATURE----- Merge tag 'gfs2-for-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Andreas Gruenbacher: - Log space and revoke accounting rework to fix some failed asserts. - Local resource group glock sharing for better local performance. - Add support for version 1802 filesystems: trusted xattr support and '-o rgrplvb' mounts by default. - Actually synchronize on the inode glock's FREEING bit during withdraw ("gfs2: fix glock confusion in function signal_our_withdraw"). - Fix parallel recovery of multiple journals ("gfs2: keep bios separate for each journal"). - Various other bug fixes. * tag 'gfs2-for-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: (49 commits) gfs2: Don't get stuck with I/O plugged in gfs2_ail1_flush gfs2: Per-revoke accounting in transactions gfs2: Rework the log space allocation logic gfs2: Minor calc_reserved cleanup gfs2: Use resource group glock sharing gfs2: Allow node-wide exclusive glock sharing gfs2: Add local resource group locking gfs2: Add per-reservation reserved block accounting gfs2: Rename rs_{free -> requested} and rd_{reserved -> requested} gfs2: Check for active reservation in gfs2_release gfs2: Don't search for unreserved space twice gfs2: Only pass reservation down to gfs2_rbm_find gfs2: Also reflect single-block allocations in rgd->rd_extfail_pt gfs2: Recursive gfs2_quota_hold in gfs2_iomap_end gfs2: Add trusted xattr support gfs2: Enable rgrplvb for sb_fs_format 1802 gfs2: Don't skip dlm unlock if glock has an lvb gfs2: Lock imbalance on error path in gfs2_recover_one gfs2: Move function gfs2_ail_empty_tr gfs2: Get rid of current_tail() ...	2021-02-23 14:04:04 -08:00
Andreas Gruenbacher	803074ad77	Merge branches 'rgrp-glock-sharing' and 'gfs2-revoke' from https://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2.git Merge the resource group glock sharing feature and the revoke accounting rework. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2021-02-23 18:54:22 +01:00
Bob Peterson	4fc7ec31c3	gfs2: Use resource group glock sharing This patch takes advantage of the new glock holder sharing feature for resource groups. We have already introduced local resource group locking in a previous patch, so competing accesses of local processes are already under control. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2021-02-17 19:30:28 +01:00
Andreas Gruenbacher	f3708fb59f	gfs2: Get rid of sd_reserving_log This counter and the associated wait queue are only used so that gfs2_make_fs_ro can efficiently wait for all pending log space allocations to fail after setting the filesystem to read-only. This comes at the cost of waking up that wait queue very frequently. Instead, when gfs2_log_reserve fails because the filesystem has become read-only, Wake up sd_log_waitq. In gfs2_make_fs_ro, set the file system read-only and then wait until all the log space has been released. Give up and report the problem after a while. With that, sd_reserving_log and sd_reserving_log_wait can be removed. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2021-02-03 18:37:24 +01:00
Andreas Gruenbacher	736b2f778f	gfs2: Un-obfuscate function jdesc_find_i Clean up this function to show that it is trivial. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2021-01-19 21:16:43 +01:00
Eric Biggers	e2728c5621	fs: don't call ->dirty_inode for lazytime timestamp updates There is no need to call ->dirty_inode for lazytime timestamp updates (i.e. for __mark_inode_dirty(I_DIRTY_TIME)), since by the definition of lazytime, filesystems must ignore these updates. Filesystems only need to care about the updated timestamps when they expire. Therefore, only call ->dirty_inode when I_DIRTY_INODE is set. Based on a patch from Christoph Hellwig: https://lore.kernel.org/r/20200325122825.1086872-4-hch@lst.de Link: https://lore.kernel.org/r/20210112190253.64307-6-ebiggers@kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jan Kara <jack@suse.cz>	2021-01-13 17:26:33 +01:00
Bob Peterson	96b1454f2e	gfs2: move freeze glock outside the make_fs_rw and _ro functions Before this patch, sister functions gfs2_make_fs_rw and gfs2_make_fs_ro locked (held) the freeze glock by calling gfs2_freeze_lock and gfs2_freeze_unlock. The problem is, not all the callers of gfs2_make_fs_ro should be doing this. The three callers of gfs2_make_fs_ro are: remount (gfs2_reconfigure), signal_our_withdraw, and unmount (gfs2_put_super). But when unmounting the file system we can get into the following circular lock dependency: deactivate_super down_write(&s->s_umount); <-------------------------------------- s_umount deactivate_locked_super gfs2_kill_sb kill_block_super generic_shutdown_super gfs2_put_super gfs2_make_fs_ro gfs2_glock_nq_init sd_freeze_gl freeze_go_sync if (freeze glock in SH) freeze_super (vfs) down_write(&sb->s_umount); <------- s_umount This patch moves the hold of the freeze glock outside the two sister rw/ro functions to their callers, but it doesn't request the glock from gfs2_put_super, thus eliminating the circular dependency. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-12-23 00:54:21 +01:00
Bob Peterson	c77b52c0a1	gfs2: Add common helper for holding and releasing the freeze glock Many places in the gfs2 code queued and dequeued the freeze glock. Almost all of them acquire it in SHARED mode, and need to specify the same LM_FLAG_NOEXP and GL_EXACT flags. This patch adds common helper functions gfs2_freeze_lock and gfs2_freeze_unlock to make the code more readable, and to prepare for the next patch. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-12-23 00:54:13 +01:00
Bob Peterson	dd64fe8167	gfs2: Remove sb_start_write from gfs2_statfs_sync Before this patch, function gfs2_statfs_sync called sb_start_write and sb_end_write. This is completely unnecessary because, aside from grabbing glocks, gfs2_statfs_sync does all its updates to statfs with a transaction: gfs2_trans_begin and _end. And transactions always do sb_start_intwrite in gfs2_trans_begin and sb_end_intwrite in gfs2_trans_end. This patch simply removes the call to sb_start_write. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-12-03 17:04:24 +01:00
Bob Peterson	a9dd945cce	gfs2: Add missing truncate_inode_pages_final for sd_aspace Gfs2 creates an address space for its rgrps called sd_aspace, but it never called truncate_inode_pages_final on it. This confused vfs greatly which tried to reference the address space after gfs2 had freed the superblock that contained it. This patch adds a call to truncate_inode_pages_final for sd_aspace, thus avoiding the use-after-free. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-10-29 22:16:46 +01:00

1 2 3 4 5 ...

296 Коммитов