2012-11-29 08:28:09 +04:00
|
|
|
/*
|
f2fs: add segment operations
This adds specific functions not only to manage dirty/free segments, SIT pages,
a cache for SIT entries, and summary entries, but also to allocate free blocks
and write three types of pages: data, node, and meta.
- F2FS maintains three types of bitmaps in memory, which indicate free, prefree,
and dirty segments respectively.
- The key information of an SIT entry consists of a segment number, the number
of valid blocks in the segment, and a bitmap identifying the valid and invalid
blocks therein.
- An SIT page is composed of a certain range of SIT entries, which is maintained
by the address space of meta_inode.
- To cache SIT entries, a simple array is used. The index for the array is the
segment number.
- A summary entry for data contains the parent node information. A summary entry
for node contains its node offset from the inode.
- F2FS manages information about six active logs and those summary entries in
memory. Whenever one of them is changed, its summary entries are flushed to
its SIT page maintained by the address space of meta_inode.
- This patch adds a default block allocation function which supports heap-based
allocation policy.
- This patch adds core functions to write data, node, and meta pages. Since LFS
basically produces a series of sequential writes, F2FS merges sequential bios
into a single one as much as possible to reduce the IO scheduling overhead.
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2012-11-02 12:09:16 +04:00
|
|
|
* fs/f2fs/segment.c
|
|
|
|
*
|
|
|
|
* Copyright (c) 2012 Samsung Electronics Co., Ltd.
|
|
|
|
* http://www.samsung.com/
|
|
|
|
*
|
|
|
|
* This program is free software; you can redistribute it and/or modify
|
|
|
|
* it under the terms of the GNU General Public License version 2 as
|
|
|
|
* published by the Free Software Foundation.
|
|
|
|
*/
|
|
|
|
#include <linux/fs.h>
|
|
|
|
#include <linux/f2fs_fs.h>
|
|
|
|
#include <linux/bio.h>
|
|
|
|
#include <linux/blkdev.h>
|
2012-12-20 01:19:30 +04:00
|
|
|
#include <linux/prefetch.h>
|
2014-04-02 10:34:36 +04:00
|
|
|
#include <linux/kthread.h>
|
2013-11-22 05:09:59 +04:00
|
|
|
#include <linux/swap.h>
|
2015-10-06 00:49:57 +03:00
|
|
|
#include <linux/timer.h>
|
2012-11-02 12:09:16 +04:00
|
|
|
|
|
|
|
#include "f2fs.h"
|
|
|
|
#include "segment.h"
|
|
|
|
#include "node.h"
|
2014-12-18 06:58:58 +03:00
|
|
|
#include "trace.h"
|
2013-04-23 12:51:43 +04:00
|
|
|
#include <trace/events/f2fs.h>
|
2012-11-02 12:09:16 +04:00
|
|
|
|
2013-11-15 05:42:51 +04:00
|
|
|
#define __reverse_ffz(x) __reverse_ffs(~(x))
|
|
|
|
|
2013-11-15 08:55:58 +04:00
|
|
|
static struct kmem_cache *discard_entry_slab;
|
2016-08-29 18:58:34 +03:00
|
|
|
static struct kmem_cache *bio_entry_slab;
|
f2fs: refactor flush_sit_entries codes for reducing SIT writes
In commit aec71382c681 ("f2fs: refactor flush_nat_entries codes for reducing NAT
writes"), we described the issue as below:
"Although building NAT journal in cursum reduce the read/write work for NAT
block, but previous design leave us lower performance when write checkpoint
frequently for these cases:
1. if journal in cursum has already full, it's a bit of waste that we flush all
nat entries to page for persistence, but not to cache any entries.
2. if journal in cursum is not full, we fill nat entries to journal until
journal is full, then flush the left dirty entries to disk without merge
journaled entries, so these journaled entries may be flushed to disk at next
checkpoint but lost chance to flushed last time."
Actually, we have the same problem in using SIT journal area.
In this patch, firstly we will update sit journal with dirty entries as many as
possible. Secondly if there is no space in sit journal, we will remove all
entries in journal and walk through the whole dirty entry bitmap of sit,
accounting dirty sit entries located in same SIT block to sit entry set. All
entry sets are linked to list sit_entry_set in sm_info, sorted ascending order
by count of entries in set. Later we flush entries in set which have fewest
entries into journal as many as we can, and then flush dense set with merged
entries to disk.
In this way we can use sit journal area more effectively, also we will reduce
SIT update, result in gaining in performance and saving lifetime of flash
device.
In my testing environment, it shows this patch can help to reduce SIT block
update obviously.
virtual machine + hard disk:
fsstress -p 20 -n 400 -l 5
sit page num cp count sit pages/cp
based 2006.50 1349.75 1.486
patched 1566.25 1463.25 1.070
Our latency of merging op is small when handling a great number of dirty SIT
entries in flush_sit_entries:
latency(ns) dirty sit count
36038 2151
49168 2123
37174 2232
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-04 14:13:01 +04:00
|
|
|
static struct kmem_cache *sit_entry_set_slab;
|
2014-10-07 04:39:50 +04:00
|
|
|
static struct kmem_cache *inmem_entry_slab;
|
2013-11-15 08:55:58 +04:00
|
|
|
|
2015-10-21 01:17:19 +03:00
|
|
|
static unsigned long __reverse_ulong(unsigned char *str)
|
|
|
|
{
|
|
|
|
unsigned long tmp = 0;
|
|
|
|
int shift = 24, idx = 0;
|
|
|
|
|
|
|
|
#if BITS_PER_LONG == 64
|
|
|
|
shift = 56;
|
|
|
|
#endif
|
|
|
|
while (shift >= 0) {
|
|
|
|
tmp |= (unsigned long)str[idx++] << shift;
|
|
|
|
shift -= BITS_PER_BYTE;
|
|
|
|
}
|
|
|
|
return tmp;
|
|
|
|
}
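For illustration only, the conversion done by __reverse_ulong() can be mimicked in userspace. This is a minimal sketch, not the kernel code: the name reverse_ulong_sketch is hypothetical, and sizeof/CHAR_BIT stand in for BITS_PER_LONG/BITS_PER_BYTE. It places the first byte of the buffer in the most significant position, so the first bit of an MSB-first bitmap ends up as the top bit of the word.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for __reverse_ulong(): build an
 * unsigned long from the bytes at str, first byte most significant.
 */
static unsigned long reverse_ulong_sketch(const unsigned char *str)
{
	unsigned long tmp = 0;
	size_t i;

	assert(str != NULL);
	for (i = 0; i < sizeof(unsigned long); i++)
		tmp = (tmp << CHAR_BIT) | str[i];
	return tmp;
}
```

The result does not depend on the host byte order, because the bytes are read one at a time rather than through an unsigned long pointer.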
|
|
|
|
|
2013-11-15 05:42:51 +04:00
|
|
|
/*
|
|
|
|
* __reverse_ffs is copied from include/asm-generic/bitops/__ffs.h since
|
|
|
|
* MSB and LSB are reversed in a byte by f2fs_set_bit.
|
|
|
|
*/
|
|
|
|
static inline unsigned long __reverse_ffs(unsigned long word)
|
|
|
|
{
|
|
|
|
int num = 0;
|
|
|
|
|
|
|
|
#if BITS_PER_LONG == 64
|
2015-10-21 01:17:19 +03:00
|
|
|
if ((word & 0xffffffff00000000UL) == 0)
|
2013-11-15 05:42:51 +04:00
|
|
|
num += 32;
|
2015-10-21 01:17:19 +03:00
|
|
|
else
|
2013-11-15 05:42:51 +04:00
|
|
|
word >>= 32;
|
|
|
|
#endif
|
2015-10-21 01:17:19 +03:00
|
|
|
if ((word & 0xffff0000) == 0)
|
2013-11-15 05:42:51 +04:00
|
|
|
num += 16;
|
2015-10-21 01:17:19 +03:00
|
|
|
else
|
2013-11-15 05:42:51 +04:00
|
|
|
word >>= 16;
|
2015-10-21 01:17:19 +03:00
|
|
|
|
|
|
|
if ((word & 0xff00) == 0)
|
2013-11-15 05:42:51 +04:00
|
|
|
num += 8;
|
2015-10-21 01:17:19 +03:00
|
|
|
else
|
2013-11-15 05:42:51 +04:00
|
|
|
word >>= 8;
|
2015-10-21 01:17:19 +03:00
|
|
|
|
2013-11-15 05:42:51 +04:00
|
|
|
if ((word & 0xf0) == 0)
|
|
|
|
num += 4;
|
|
|
|
else
|
|
|
|
word >>= 4;
|
2015-10-21 01:17:19 +03:00
|
|
|
|
2013-11-15 05:42:51 +04:00
|
|
|
if ((word & 0xc) == 0)
|
|
|
|
num += 2;
|
|
|
|
else
|
|
|
|
word >>= 2;
|
2015-10-21 01:17:19 +03:00
|
|
|
|
2013-11-15 05:42:51 +04:00
|
|
|
if ((word & 0x2) == 0)
|
|
|
|
num += 1;
|
|
|
|
return num;
|
|
|
|
}
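Walking through the branches above suggests that, for a non-zero word, __reverse_ffs() returns the distance of the first set bit from the most significant end, i.e. the number of leading zero bits. The loop-based userspace sketch below computes the same value under that reading. The names reverse_ffs_sketch, reverse_ffz_sketch and ULONG_BITS are hypothetical, and the loop trades the kernel's branch-halving search for readability.

```c
#include <assert.h>
#include <limits.h>

#define ULONG_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Position of the first set bit, counted from the MSB (word must be non-zero). */
static unsigned int reverse_ffs_sketch(unsigned long word)
{
	unsigned long mask = 1UL << (ULONG_BITS - 1);
	unsigned int num = 0;

	assert(word != 0);
	while (!(word & mask)) {
		mask >>= 1;
		num++;
	}
	return num;
}

/* Position of the first clear bit from the MSB, mirroring __reverse_ffz(). */
static unsigned int reverse_ffz_sketch(unsigned long word)
{
	return reverse_ffs_sketch(~word);
}
```

As with the kernel helper, the result is undefined for a word with no set bit (or no clear bit for the ffz variant); the sketch turns that precondition into an assertion.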
|
|
|
|
|
|
|
|
/*
|
2014-08-06 18:22:50 +04:00
|
|
|
* __find_rev_next(_zero)_bit is copied from lib/find_next_bit.c because
|
2013-11-15 05:42:51 +04:00
|
|
|
* f2fs_set_bit makes MSB and LSB reversed in a byte.
|
2015-11-12 03:43:04 +03:00
|
|
|
* @size must be an integral multiple of BITS_PER_LONG.
|
2013-11-15 05:42:51 +04:00
|
|
|
* Example:
|
2015-10-21 01:17:19 +03:00
|
|
|
* MSB <--> LSB
|
|
|
|
* f2fs_set_bit(0, bitmap) => 1000 0000
|
|
|
|
* f2fs_set_bit(7, bitmap) => 0000 0001
|
2013-11-15 05:42:51 +04:00
|
|
|
*/
|
|
|
|
static unsigned long __find_rev_next_bit(const unsigned long *addr,
|
|
|
|
unsigned long size, unsigned long offset)
|
|
|
|
{
|
|
|
|
const unsigned long *p = addr + BIT_WORD(offset);
|
2015-11-12 03:43:04 +03:00
|
|
|
unsigned long result = size;
|
2013-11-15 05:42:51 +04:00
|
|
|
unsigned long tmp;
|
|
|
|
|
|
|
|
if (offset >= size)
|
|
|
|
return size;
|
|
|
|
|
2015-11-12 03:43:04 +03:00
|
|
|
size -= (offset & ~(BITS_PER_LONG - 1));
|
2013-11-15 05:42:51 +04:00
|
|
|
offset %= BITS_PER_LONG;
|
2015-10-21 01:17:19 +03:00
|
|
|
|
2015-11-12 03:43:04 +03:00
|
|
|
while (1) {
|
|
|
|
if (*p == 0)
|
|
|
|
goto pass;
|
2013-11-15 05:42:51 +04:00
|
|
|
|
2015-10-21 01:17:19 +03:00
|
|
|
tmp = __reverse_ulong((unsigned char *)p);
|
2015-11-12 03:43:04 +03:00
|
|
|
|
|
|
|
tmp &= ~0UL >> offset;
|
|
|
|
if (size < BITS_PER_LONG)
|
|
|
|
tmp &= (~0UL << (BITS_PER_LONG - size));
|
2013-11-15 05:42:51 +04:00
|
|
|
if (tmp)
|
2015-11-12 03:43:04 +03:00
|
|
|
goto found;
|
|
|
|
pass:
|
|
|
|
if (size <= BITS_PER_LONG)
|
|
|
|
break;
|
2013-11-15 05:42:51 +04:00
|
|
|
size -= BITS_PER_LONG;
|
2015-11-12 03:43:04 +03:00
|
|
|
offset = 0;
|
2015-10-21 01:17:19 +03:00
|
|
|
p++;
|
2013-11-15 05:42:51 +04:00
|
|
|
}
|
2015-11-12 03:43:04 +03:00
|
|
|
return result;
|
|
|
|
found:
|
|
|
|
return result - size + __reverse_ffs(tmp);
|
2013-11-15 05:42:51 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
static unsigned long __find_rev_next_zero_bit(const unsigned long *addr,
|
|
|
|
unsigned long size, unsigned long offset)
|
|
|
|
{
|
|
|
|
const unsigned long *p = addr + BIT_WORD(offset);
|
2015-12-05 03:51:13 +03:00
|
|
|
unsigned long result = size;
|
2013-11-15 05:42:51 +04:00
|
|
|
unsigned long tmp;
|
|
|
|
|
|
|
|
if (offset >= size)
|
|
|
|
return size;
|
|
|
|
|
2015-12-05 03:51:13 +03:00
|
|
|
size -= (offset & ~(BITS_PER_LONG - 1));
|
2013-11-15 05:42:51 +04:00
|
|
|
offset %= BITS_PER_LONG;
|
2015-12-05 03:51:13 +03:00
|
|
|
|
|
|
|
while (1) {
|
|
|
|
if (*p == ~0UL)
|
|
|
|
goto pass;
|
|
|
|
|
2015-10-21 01:17:19 +03:00
|
|
|
tmp = __reverse_ulong((unsigned char *)p);
|
2015-12-05 03:51:13 +03:00
|
|
|
|
|
|
|
if (offset)
|
|
|
|
tmp |= ~0UL << (BITS_PER_LONG - offset);
|
|
|
|
if (size < BITS_PER_LONG)
|
|
|
|
tmp |= ~0UL >> size;
|
2015-10-21 01:17:19 +03:00
|
|
|
if (tmp != ~0UL)
|
2015-12-05 03:51:13 +03:00
|
|
|
goto found;
|
|
|
|
pass:
|
|
|
|
if (size <= BITS_PER_LONG)
|
|
|
|
break;
|
2013-11-15 05:42:51 +04:00
|
|
|
size -= BITS_PER_LONG;
|
2015-12-05 03:51:13 +03:00
|
|
|
offset = 0;
|
2015-10-21 01:17:19 +03:00
|
|
|
p++;
|
2013-11-15 05:42:51 +04:00
|
|
|
}
|
2015-12-05 03:51:13 +03:00
|
|
|
return result;
|
|
|
|
found:
|
|
|
|
return result - size + __reverse_ffz(tmp);
|
2013-11-15 05:42:51 +04:00
|
|
|
}
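The zero-bit search relies on forcing bits outside the window to 1 so that they can never be reported as free. The single-word sketch below isolates that masking step. It is a simplified userspace model with hypothetical names (first_zero_in_word, WORD_BITS), operating on a word that is already in MSB-first order.

```c
#include <assert.h>
#include <limits.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/*
 * Find the first zero bit in [offset, size) of one MSB-first word.
 * Returns size when every bit in the window is set.
 */
static unsigned long first_zero_in_word(unsigned long word,
					unsigned long size,
					unsigned long offset)
{
	unsigned long pos = 0;

	assert(size <= WORD_BITS && offset < size);

	if (offset)			/* hide bits before the window */
		word |= ~0UL << (WORD_BITS - offset);
	if (size < WORD_BITS)		/* hide bits past the window */
		word |= ~0UL >> size;
	if (word == ~0UL)
		return size;

	/* At least one zero remains, so this loop terminates. */
	while (word & (1UL << (WORD_BITS - 1 - pos)))
		pos++;
	return pos;
}
```

Guarding both shifts keeps the shift count strictly below the word width, which matters because shifting by the full width is undefined behavior in C.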
|
|
|
|
|
2014-10-07 04:39:50 +04:00
|
|
|
void register_inmem_page(struct inode *inode, struct page *page)
|
|
|
|
{
|
|
|
|
struct f2fs_inode_info *fi = F2FS_I(inode);
|
|
|
|
struct inmem_pages *new;
|
2014-12-05 21:39:49 +03:00
|
|
|
|
2014-12-18 06:58:58 +03:00
|
|
|
f2fs_trace_pid(page);
|
2014-12-05 22:58:02 +03:00
|
|
|
|
2015-08-07 13:42:09 +03:00
|
|
|
set_page_private(page, (unsigned long)ATOMIC_WRITTEN_PAGE);
|
|
|
|
SetPagePrivate(page);
|
|
|
|
|
2014-10-07 04:39:50 +04:00
|
|
|
new = f2fs_kmem_cache_alloc(inmem_entry_slab, GFP_NOFS);
|
|
|
|
|
|
|
|
/* add atomic page indices to the list */
|
|
|
|
new->page = page;
|
|
|
|
INIT_LIST_HEAD(&new->list);
|
2015-08-07 13:42:09 +03:00
|
|
|
|
2014-10-07 04:39:50 +04:00
|
|
|
/* increase reference count with clean state */
|
|
|
|
mutex_lock(&fi->inmem_lock);
|
|
|
|
get_page(page);
|
|
|
|
list_add_tail(&new->list, &fi->inmem_pages);
|
2014-12-06 04:18:15 +03:00
|
|
|
inc_page_count(F2FS_I_SB(inode), F2FS_INMEM_PAGES);
|
2014-10-07 04:39:50 +04:00
|
|
|
mutex_unlock(&fi->inmem_lock);
|
2015-03-18 03:58:08 +03:00
|
|
|
|
|
|
|
trace_f2fs_register_inmem_page(page, INMEM);
|
2014-10-07 04:39:50 +04:00
|
|
|
}
|
|
|
|
|
f2fs: support revoking atomic written pages
f2fs supports atomic write with the following semantics:
1. open db file
2. ioctl start atomic write
3. (write db file) * n
4. ioctl commit atomic write
5. close db file
With this flow we can avoid file becoming corrupted when abnormal power
cut, because we hold data of transaction in referenced pages linked in
inmem_pages list of inode, but without setting them dirty, so these data
won't be persisted unless we commit them in step 4.
But we should still hold journal db file in memory by using volatile
write, because our semantics of 'atomic write support' is incomplete, in
step 4, we could fail to submit all dirty data of transaction, once
partial dirty data was committed in storage, then after a checkpoint &
abnormal power-cut, db file will be corrupted forever.
So this patch tries to improve atomic write flow by adding a revoking flow,
once inner error occurs in committing, this gives another chance to try to
revoke these partial submitted data of current transaction, it makes
committing operation more like an atomic one.
If we're not lucky, once revoking operation was failed, EAGAIN will be
reported to user for suggesting doing the recovery with held journal file,
or retrying current transaction again.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-02-06 09:40:34 +03:00
|
|
|
static int __revoke_inmem_pages(struct inode *inode,
|
|
|
|
struct list_head *head, bool drop, bool recover)
|
2016-02-06 09:38:29 +03:00
|
|
|
{
|
2016-02-06 09:40:34 +03:00
|
|
|
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
|
2016-02-06 09:38:29 +03:00
|
|
|
struct inmem_pages *cur, *tmp;
|
2016-02-06 09:40:34 +03:00
|
|
|
int err = 0;
|
2016-02-06 09:38:29 +03:00
|
|
|
|
|
|
|
list_for_each_entry_safe(cur, tmp, head, list) {
|
2016-02-06 09:40:34 +03:00
|
|
|
struct page *page = cur->page;
|
|
|
|
|
|
|
|
if (drop)
|
|
|
|
trace_f2fs_commit_inmem_page(page, INMEM_DROP);
|
|
|
|
|
|
|
|
lock_page(page);
|
2016-02-06 09:38:29 +03:00
|
|
|
|
2016-02-06 09:40:34 +03:00
|
|
|
if (recover) {
|
|
|
|
struct dnode_of_data dn;
|
|
|
|
struct node_info ni;
|
|
|
|
|
|
|
|
trace_f2fs_commit_inmem_page(page, INMEM_REVOKE);
|
|
|
|
|
|
|
|
set_new_dnode(&dn, inode, NULL, NULL, 0);
|
|
|
|
if (get_dnode_of_data(&dn, page->index, LOOKUP_NODE)) {
|
|
|
|
err = -EAGAIN;
|
|
|
|
goto next;
|
|
|
|
}
|
|
|
|
get_node_info(sbi, dn.nid, &ni);
|
|
|
|
f2fs_replace_block(sbi, &dn, dn.data_blkaddr,
|
|
|
|
cur->old_addr, ni.version, true, true);
|
|
|
|
f2fs_put_dnode(&dn);
|
|
|
|
}
|
|
|
|
next:
|
2016-04-13 00:11:03 +03:00
|
|
|
/* we don't need to invalidate this in the successful case */
|
|
|
|
if (drop || recover)
|
|
|
|
ClearPageUptodate(page);
|
2016-02-06 09:40:34 +03:00
|
|
|
set_page_private(page, 0);
|
2016-04-29 15:13:36 +03:00
|
|
|
ClearPagePrivate(page);
|
2016-02-06 09:40:34 +03:00
|
|
|
f2fs_put_page(page, 1);
|
2016-02-06 09:38:29 +03:00
|
|
|
|
|
|
|
list_del(&cur->list);
|
|
|
|
kmem_cache_free(inmem_entry_slab, cur);
|
|
|
|
dec_page_count(F2FS_I_SB(inode), F2FS_INMEM_PAGES);
|
|
|
|
}
|
2016-02-06 09:40:34 +03:00
|
|
|
return err;
|
2016-02-06 09:38:29 +03:00
|
|
|
}
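The revoke path above restores cur->old_addr for each page when recover is set. A toy userspace model of the overall commit-then-revoke idea is sketched below. Everything in it is hypothetical (toy_page, toy_commit, the fail_at parameter), and it assumes the commit path records the previous block address before overwriting it, which is not shown in this part of the file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of an inode's inmem_pages list. */
struct toy_page {
	size_t index;		/* slot in the block address table */
	unsigned int new_addr;	/* address the transaction wants */
	unsigned int old_addr;	/* saved so the update can be revoked */
};

/*
 * Apply n updates to blkaddr. The update at position fail_at is made
 * to "fail"; earlier updates are then rolled back. Returns 0 on
 * success and -1 after a rollback.
 */
static int toy_commit(unsigned int *blkaddr, struct toy_page *pages,
		      size_t n, size_t fail_at)
{
	size_t done, i;

	assert(blkaddr != NULL && pages != NULL);

	for (done = 0; done < n; done++) {
		if (done == fail_at)
			goto revoke;
		pages[done].old_addr = blkaddr[pages[done].index];
		blkaddr[pages[done].index] = pages[done].new_addr;
	}
	return 0;

revoke:
	/* Undo in reverse order so repeated indices restore correctly. */
	for (i = done; i-- > 0; )
		blkaddr[pages[i].index] = pages[i].old_addr;
	return -1;
}
```

The model leaves out locking, page state and the case where the revoke itself fails, which the commit message says is reported to the caller as EAGAIN.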
|
|
|
|
|
|
|
|
void drop_inmem_pages(struct inode *inode)
|
|
|
|
{
|
|
|
|
struct f2fs_inode_info *fi = F2FS_I(inode);
|
|
|
|
|
|
|
|
mutex_lock(&fi->inmem_lock);
|
2016-02-06 09:40:34 +03:00
|
|
|
__revoke_inmem_pages(inode, &fi->inmem_pages, true, false);
|
2016-02-06 09:38:29 +03:00
|
|
|
mutex_unlock(&fi->inmem_lock);
|
2017-01-07 13:50:26 +03:00
|
|
|
|
|
|
|
clear_inode_flag(inode, FI_ATOMIC_FILE);
|
|
|
|
stat_dec_atomic_write(inode);
|
2016-02-06 09:38:29 +03:00
|
|
|
}
|
|
|
|
|
2016-02-06 09:40:34 +03:00
|
|
|
static int __commit_inmem_pages(struct inode *inode,
|
|
|
|
struct list_head *revoke_list)
|
2014-10-07 04:39:50 +04:00
|
|
|
{
|
|
|
|
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
|
|
|
|
struct f2fs_inode_info *fi = F2FS_I(inode);
|
|
|
|
struct inmem_pages *cur, *tmp;
|
|
|
|
struct f2fs_io_info fio = {
|
2015-04-24 00:38:15 +03:00
|
|
|
.sbi = sbi,
|
2014-10-07 04:39:50 +04:00
|
|
|
.type = DATA,
|
2016-06-05 22:31:55 +03:00
|
|
|
.op = REQ_OP_WRITE,
|
2016-11-01 16:40:10 +03:00
|
|
|
.op_flags = REQ_SYNC | REQ_PRIO,
|
2015-04-23 22:04:33 +03:00
|
|
|
.encrypted_page = NULL,
|
2014-10-07 04:39:50 +04:00
|
|
|
};
|
2016-02-06 09:38:29 +03:00
|
|
|
bool submit_bio = false;
|
2015-07-25 10:52:52 +03:00
|
|
|
int err = 0;
|
2014-10-07 04:39:50 +04:00
|
|
|
|
|
|
|
list_for_each_entry_safe(cur, tmp, &fi->inmem_pages, list) {
|
2016-02-06 09:40:34 +03:00
|
|
|
struct page *page = cur->page;
|
|
|
|
|
|
|
|
lock_page(page);
|
|
|
|
if (page->mapping == inode->i_mapping) {
|
|
|
|
trace_f2fs_commit_inmem_page(page, INMEM);
|
|
|
|
|
|
|
|
set_page_dirty(page);
|
|
|
|
f2fs_wait_on_page_writeback(page, DATA, true);
|
2016-10-11 17:57:01 +03:00
|
|
|
if (clear_page_dirty_for_io(page)) {
|
2016-02-06 09:38:29 +03:00
|
|
|
inode_dec_dirty_pages(inode);
|
2016-10-11 17:57:01 +03:00
|
|
|
remove_dirty_inode(inode);
|
|
|
|
}
|
f2fs: support revoking atomic written pages
f2fs support atomic write with following semantics:
1. open db file
2. ioctl start atomic write
3. (write db file) * n
4. ioctl commit atomic write
5. close db file
With this flow we can avoid file becoming corrupted when abnormal power
cut, because we hold data of transaction in referenced pages linked in
inmem_pages list of inode, but without setting them dirty, so these data
won't be persisted unless we commit them in step 4.
But we should still hold journal db file in memory by using volatile
write, because our semantics of 'atomic write support' is incomplete, in
step 4, we could fail to submit all dirty data of transaction, once
partial dirty data was committed in storage, then after a checkpoint &
abnormal power-cut, db file will be corrupted forever.
So this patch tries to improve atomic write flow by adding a revoking flow,
once inner error occurs in committing, this gives another chance to try to
revoke these partial submitted data of current transaction, it makes
committing operation more like aotmical one.
If we're not lucky, once revoking operation was failed, EAGAIN will be
reported to user for suggesting doing the recovery with held journal file,
or retrying current transaction again.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-02-06 09:40:34 +03:00
|
|
|
|
|
|
|
fio.page = page;
|
2016-02-06 09:38:29 +03:00
|
|
|
err = do_write_data_page(&fio);
|
|
|
|
if (err) {
|
f2fs: support revoking atomic written pages
f2fs support atomic write with following semantics:
1. open db file
2. ioctl start atomic write
3. (write db file) * n
4. ioctl commit atomic write
5. close db file
With this flow we can avoid file becoming corrupted when abnormal power
cut, because we hold data of transaction in referenced pages linked in
inmem_pages list of inode, but without setting them dirty, so these data
won't be persisted unless we commit them in step 4.
But we should still hold journal db file in memory by using volatile
write, because our semantics of 'atomic write support' is incomplete, in
step 4, we could fail to submit all dirty data of transaction, once
partial dirty data was committed in storage, then after a checkpoint &
abnormal power-cut, db file will be corrupted forever.
So this patch tries to improve atomic write flow by adding a revoking flow,
once inner error occurs in committing, this gives another chance to try to
revoke these partial submitted data of current transaction, it makes
committing operation more like aotmical one.
If we're not lucky, once revoking operation was failed, EAGAIN will be
reported to user for suggesting doing the recovery with held journal file,
or retrying current transaction again.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-02-06 09:40:34 +03:00
|
|
|
unlock_page(page);
|
2016-02-06 09:38:29 +03:00
|
|
|
break;
|
2014-12-11 00:59:33 +03:00
|
|
|
}
|
2016-02-06 09:38:29 +03:00
|
|
|
|
f2fs: support revoking atomic written pages
f2fs support atomic write with following semantics:
1. open db file
2. ioctl start atomic write
3. (write db file) * n
4. ioctl commit atomic write
5. close db file
With this flow we can avoid file becoming corrupted when abnormal power
cut, because we hold data of transaction in referenced pages linked in
inmem_pages list of inode, but without setting them dirty, so these data
won't be persisted unless we commit them in step 4.
But we should still hold journal db file in memory by using volatile
write, because our semantics of 'atomic write support' is incomplete, in
step 4, we could fail to submit all dirty data of transaction, once
partial dirty data was committed in storage, then after a checkpoint &
abnormal power-cut, db file will be corrupted forever.
So this patch tries to improve atomic write flow by adding a revoking flow,
once inner error occurs in committing, this gives another chance to try to
revoke these partial submitted data of current transaction, it makes
committing operation more like aotmical one.
If we're not lucky, once revoking operation was failed, EAGAIN will be
reported to user for suggesting doing the recovery with held journal file,
or retrying current transaction again.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-02-06 09:40:34 +03:00
|
|
|
/* record old blkaddr for revoking */
|
|
|
|
cur->old_addr = fio.old_blkaddr;
|
2015-08-07 13:42:09 +03:00
|
|
|
|
f2fs: support revoking atomic written pages
f2fs support atomic write with following semantics:
1. open db file
2. ioctl start atomic write
3. (write db file) * n
4. ioctl commit atomic write
5. close db file
With this flow we can avoid file becoming corrupted when abnormal power
cut, because we hold data of transaction in referenced pages linked in
inmem_pages list of inode, but without setting them dirty, so these data
won't be persisted unless we commit them in step 4.
But we should still hold journal db file in memory by using volatile
write, because our semantics of 'atomic write support' is incomplete, in
step 4, we could fail to submit all dirty data of transaction, once
partial dirty data was committed in storage, then after a checkpoint &
abnormal power-cut, db file will be corrupted forever.
So this patch tries to improve atomic write flow by adding a revoking flow,
once inner error occurs in committing, this gives another chance to try to
revoke these partial submitted data of current transaction, it makes
committing operation more like aotmical one.
If we're not lucky, once revoking operation was failed, EAGAIN will be
reported to user for suggesting doing the recovery with held journal file,
or retrying current transaction again.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-02-06 09:40:34 +03:00
|
|
|
submit_bio = true;
|
|
|
|
}
|
|
|
|
unlock_page(page);
|
|
|
|
list_move_tail(&cur->list, revoke_list);
|
2014-10-07 04:39:50 +04:00
|
|
|
}
|
2016-02-06 09:38:29 +03:00
|
|
|
|
|
|
|
if (submit_bio)
|
|
|
|
f2fs_submit_merged_bio_cond(sbi, inode, NULL, 0, DATA, WRITE);
|
f2fs: support revoking atomic written pages
f2fs support atomic write with following semantics:
1. open db file
2. ioctl start atomic write
3. (write db file) * n
4. ioctl commit atomic write
5. close db file
With this flow we can avoid file becoming corrupted when abnormal power
cut, because we hold data of transaction in referenced pages linked in
inmem_pages list of inode, but without setting them dirty, so these data
won't be persisted unless we commit them in step 4.
But we should still hold journal db file in memory by using volatile
write, because our semantics of 'atomic write support' is incomplete, in
step 4, we could fail to submit all dirty data of transaction, once
partial dirty data was committed in storage, then after a checkpoint &
abnormal power-cut, db file will be corrupted forever.
So this patch tries to improve atomic write flow by adding a revoking flow,
once inner error occurs in committing, this gives another chance to try to
revoke these partial submitted data of current transaction, it makes
committing operation more like aotmical one.
If we're not lucky, once revoking operation was failed, EAGAIN will be
reported to user for suggesting doing the recovery with held journal file,
or retrying current transaction again.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-02-06 09:40:34 +03:00
|
|
|
|
|
|
|
if (!err)
|
|
|
|
__revoke_inmem_pages(inode, revoke_list, false, false);
|
|
|
|
|
2016-02-06 09:38:29 +03:00
|
|
|
return err;
|
|
|
|
}
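
The loop above saves each page's previous block address in `cur->old_addr` before moving the entry to the revoke list, so a later failure can be undone. The following is a minimal userspace sketch of that record-then-revoke idea, not the kernel implementation: `struct pending_write`, `commit_writes()`, `revoke_writes()`, `commit_or_revoke()` and the `fail_at` failure-injection parameter are all hypothetical names, and a plain `int` array stands in for the on-disk block addresses.

```c
#include <stddef.h>

/* Hypothetical stand-in for one entry on the inmem_pages list. */
struct pending_write {
	int new_addr;	/* value the commit wants to store */
	int old_addr;	/* previous value, recorded for revoking */
	int committed;	/* non-zero once the write has been applied */
};

/*
 * Apply the writes in order to table[]. A negative fail_at injects no
 * failure; otherwise the write at index fail_at fails and the loop stops.
 * Returns 0 on success, -1 on the injected failure.
 */
static int commit_writes(int *table, struct pending_write *w, size_t n,
			 long fail_at)
{
	for (size_t i = 0; i < n; i++) {
		if ((long)i == fail_at)
			return -1;
		w[i].old_addr = table[i];	/* record old value first */
		table[i] = w[i].new_addr;
		w[i].committed = 1;
	}
	return 0;
}

/* Restore the recorded old value for every write that was applied. */
static void revoke_writes(int *table, struct pending_write *w, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (w[i].committed) {
			table[i] = w[i].old_addr;
			w[i].committed = 0;
		}
	}
}

/* Commit everything, or leave table[] exactly as it was. */
static int commit_or_revoke(int *table, struct pending_write *w, size_t n,
			    long fail_at)
{
	int err = commit_writes(table, w, n, fail_at);

	if (err)
		revoke_writes(table, w, n);
	return err;
}
```

Calling `commit_or_revoke()` with a failure injected partway through leaves the table unchanged, while a call without a failure applies every write. The sketch has no case where revoking itself fails, which the kernel code has to handle.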

int commit_inmem_pages(struct inode *inode)
{
	struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
	struct f2fs_inode_info *fi = F2FS_I(inode);
	struct list_head revoke_list;
	int err;

	INIT_LIST_HEAD(&revoke_list);
	f2fs_balance_fs(sbi, true);
	f2fs_lock_op(sbi);

	set_inode_flag(inode, FI_ATOMIC_COMMIT);

	mutex_lock(&fi->inmem_lock);
	err = __commit_inmem_pages(inode, &revoke_list);
	if (err) {
		int ret;
		/*
		 * Try to revoke all committed pages. Revoking can itself fail
		 * (e.g. for lack of memory); in that case EAGAIN is returned,
		 * which means the transaction has already lost its integrity
		 * and the caller should recover from its journal or rewrite
		 * and commit the last transaction again. For any other error
		 * number, the filesystem has already revoked the pages itself.
		 */
		ret = __revoke_inmem_pages(inode, &revoke_list, false, true);
		if (ret)
			err = ret;

		/* drop all uncommitted pages */
		__revoke_inmem_pages(inode, &fi->inmem_pages, true, false);
	}
	mutex_unlock(&fi->inmem_lock);

	clear_inode_flag(inode, FI_ATOMIC_COMMIT);

	f2fs_unlock_op(sbi);
	return err;
}
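
From userspace, the commit path above is reached through the atomic-write ioctls named in the commit message (open, start atomic write, write, commit, close). The following is a minimal sketch of that flow, assuming a Linux host. The `atomic_update()` helper is hypothetical, and the ioctl numbers are redefined locally on the assumption that they match the f2fs definitions; prefer the kernel's own header where one is available.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Assumption: these mirror the f2fs ioctl definitions. */
#ifndef F2FS_IOCTL_MAGIC
#define F2FS_IOCTL_MAGIC		0xf5
#endif
#ifndef F2FS_IOC_START_ATOMIC_WRITE
#define F2FS_IOC_START_ATOMIC_WRITE	_IO(F2FS_IOCTL_MAGIC, 1)
#endif
#ifndef F2FS_IOC_COMMIT_ATOMIC_WRITE
#define F2FS_IOC_COMMIT_ATOMIC_WRITE	_IO(F2FS_IOCTL_MAGIC, 2)
#endif

/*
 * Hypothetical helper: overwrite the start of an existing file as one
 * atomic transaction. Returns 0 on success or a negative errno value.
 * Per the commit message, -EAGAIN from the commit step means revoking
 * failed, so the caller should recover from its journal or retry.
 */
static int atomic_update(const char *path, const void *buf, size_t len)
{
	ssize_t n;
	int err = 0;
	int fd = open(path, O_RDWR);			/* 1. open db file */

	if (fd < 0)
		return -errno;

	if (ioctl(fd, F2FS_IOC_START_ATOMIC_WRITE) < 0) {	/* 2. start */
		err = -errno;
		goto out;
	}

	n = write(fd, buf, len);			/* 3. write db file */
	if (n < 0) {
		err = -errno;
		goto out;
	}
	if ((size_t)n != len) {
		err = -EIO;				/* short write */
		goto out;
	}

	if (ioctl(fd, F2FS_IOC_COMMIT_ATOMIC_WRITE) < 0)	/* 4. commit */
		err = -errno;
out:
	close(fd);					/* 5. close db file */
	return err;
}
```

On a file that does not live on f2fs, the start ioctl is expected to fail and the helper returns that error without writing anything.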

/*
 * This function balances dirty node and dentry pages.
 * In addition, it controls garbage collection.
 */
void f2fs_balance_fs(struct f2fs_sb_info *sbi, bool need)
{
#ifdef CONFIG_F2FS_FAULT_INJECTION
	if (time_to_inject(sbi, FAULT_CHECKPOINT))
		f2fs_stop_checkpoint(sbi, false);
#endif

	if (!need)
		return;

	/* balance_fs_bg is able to be pending */
	if (excess_cached_nats(sbi))
		f2fs_balance_fs_bg(sbi);

	/*
	 * We should do GC or end up with checkpoint, if there are so many dirty
	 * dir/node pages without enough free segments.
	 */
	if (has_not_enough_free_secs(sbi, 0, 0)) {
		mutex_lock(&sbi->gc_mutex);
		f2fs_gc(sbi, false, false);
	}
}

f2fs: enable rb-tree extent cache
This patch enables rb-tree based extent cache in f2fs.
When we mount with "-o extent_cache", f2fs will try to add recently accessed
page-block mappings into the rb-tree based extent cache as much as possible,
instead of the original single extent info cache.
This way, f2fs can support a more effective cache between the dnode page cache
and disk. It will supply a high hit ratio in the cache with less memory when
the dnode page cache is reclaimed in a low-memory environment.
Storage: Sandisk sd card 64g
1.append write file (offset: 0, size: 128M);
2.override write file (offset: 2M, size: 1M);
3.override write file (offset: 4M, size: 1M);
...
4.override write file (offset: 48M, size: 1M);
...
5.override write file (offset: 112M, size: 1M);
6.sync
7.echo 3 > /proc/sys/vm/drop_caches
8.read file (size:128M, unit: 4k, count: 32768)
(time dd if=/mnt/f2fs/128m bs=4k count=32768)
Extent Hit Ratio:
              before        patched
Hit Ratio     121 / 1071    1071 / 1071
Performance:
              before        patched
real          0m37.051s     0m35.556s
user          0m0.040s      0m0.026s
sys           0m2.990s      0m2.251s
Memory Cost:
              before        patched
Tree Count:   0             1 (size: 24 bytes)
Node Count:   0             45 (size: 1440 bytes)
v3:
o retest and given more details of test result.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-02-05 12:57:31 +03:00

f2fs: flush dirty nat entries when exceeding threshold
When testing f2fs with xfstest, generic/251 is stuck for a long time.
The case uses the steps below to obtain freshly released space in the device,
in order to prepare for the following fstrim test.
1. rm -rf /mnt/dir
2. mkdir /mnt/dir/
3. cp -axT `pwd`/ /mnt/dir/
4. goto 1
During the preparing step, all nat entries will be cached in the nat cache.
Most of them are dirty entries with invalid blkaddr, which means the
nodes related to these entries have been truncated, and they could
be reused after the dirty entries have been checkpointed.
However, no checkpoint was triggered, so nid allocators
(e.g. mkdir, creat) will run into a long journey of iterating all NAT
pages, looking for free nids in alloc_nid->build_free_nids.
Here, in f2fs_balance_fs_bg we give another chance to do a checkpoint
to flush nat entries for reusing them in the free nid cache when the dirty
entry count exceeds 10% of the max count.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-01-18 13:31:18 +03:00

f2fs: support data flush in background
Previously, when finishing a checkpoint, we have persisted all fs meta
info including meta inode, node inode, dentry page of directory inode, so,
after a sudden power cut, f2fs can recover from the last checkpoint with the
full directory structure.
But during checkpoint, we didn't flush dirty pages of regular and symlink
inodes, so such dirty data still in memory will be lost at the moment of
power off.
In order to reduce the chance of lost data, this patch enables
f2fs_balance_fs_bg with the ability of data flushing. It will try to flush
user data before starting a checkpoint. So user data written after the last
checkpoint, which may not have been fsynced, could be saved.
When we mount with the data_flush option, after every period of cp_interval
(could be configured in sysfs: /sys/fs/f2fs/device/cp_interval) seconds
user data could be flushed into the device once f2fs_balance_fs_bg was called
in a kworker thread or the gc thread.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2015-12-17 12:13:28 +03:00

void f2fs_balance_fs_bg(struct f2fs_sb_info *sbi)
{
	/* try to shrink extent cache when there is not enough memory */
	if (!available_free_memory(sbi, EXTENT_CACHE))
		f2fs_shrink_extent_tree(sbi, EXTENT_CACHE_SHRINK_NUMBER);

	/* check the # of cached NAT entries */
	if (!available_free_memory(sbi, NAT_ENTRIES))
		try_to_free_nats(sbi, NAT_ENTRY_PER_BLOCK);

	if (!available_free_memory(sbi, FREE_NIDS))
		try_to_free_nids(sbi, MAX_FREE_NIDS);
	else
		build_free_nids(sbi, false);

	if (!is_idle(sbi))
		return;

	/* checkpoint is the only way to shrink partial cached entries */
	if (!available_free_memory(sbi, NAT_ENTRIES) ||
			!available_free_memory(sbi, INO_ENTRIES) ||
			excess_prefree_segs(sbi) ||
			excess_dirty_nats(sbi) ||
			f2fs_time_over(sbi, CP_TIME)) {
		if (test_opt(sbi, DATA_FLUSH)) {
			struct blk_plug plug;

			blk_start_plug(&plug);
			sync_dirty_inodes(sbi, FILE_INODE);
			blk_finish_plug(&plug);
		}
		f2fs_sync_fs(sbi->sb, true);
		stat_inc_bg_cp_count(sbi->stat_info);
	}
}

static int __submit_flush_wait(struct block_device *bdev)
{
	struct bio *bio = f2fs_bio_alloc(0);
	int ret;

	bio->bi_opf = REQ_OP_WRITE | REQ_PREFLUSH;
	bio->bi_bdev = bdev;
	ret = submit_bio_wait(bio);
	bio_put(bio);
	return ret;
}

static int submit_flush_wait(struct f2fs_sb_info *sbi)
{
	int ret = __submit_flush_wait(sbi->sb->s_bdev);
	int i;

	if (sbi->s_ndevs && !ret) {
		for (i = 1; i < sbi->s_ndevs; i++) {
			ret = __submit_flush_wait(FDEV(i).bdev);
			if (ret)
				break;
		}
	}
	return ret;
}
|
|
|
|
|
2014-04-27 10:21:33 +04:00
|
|
|
static int issue_flush_thread(void *data)
|
2014-04-02 10:34:36 +04:00
|
|
|
{
|
|
|
|
struct f2fs_sb_info *sbi = data;
|
2014-04-27 10:21:21 +04:00
|
|
|
struct flush_cmd_control *fcc = SM_I(sbi)->cmd_control_info;
|
|
|
|
wait_queue_head_t *q = &fcc->flush_wait_queue;
|
2014-04-02 10:34:36 +04:00
|
|
|
repeat:
|
|
|
|
if (kthread_should_stop())
|
|
|
|
return 0;
|
|
|
|
|
2014-09-05 14:31:00 +04:00
|
|
|
if (!llist_empty(&fcc->issue_list)) {
|
2014-04-02 10:34:36 +04:00
|
|
|
struct flush_cmd *cmd, *next;
|
|
|
|
int ret;
|
|
|
|
|
2014-09-05 14:31:00 +04:00
|
|
|
fcc->dispatch_list = llist_del_all(&fcc->issue_list);
|
|
|
|
fcc->dispatch_list = llist_reverse_order(fcc->dispatch_list);
|
|
|
|
|
2016-10-07 05:02:05 +03:00
|
|
|
ret = submit_flush_wait(sbi);
|
2014-09-05 14:31:00 +04:00
|
|
|
llist_for_each_entry_safe(cmd, next,
|
|
|
|
fcc->dispatch_list, llnode) {
|
2014-04-02 10:34:36 +04:00
|
|
|
cmd->ret = ret;
|
|
|
|
complete(&cmd->wait);
|
|
|
|
}
|
2014-04-27 10:21:21 +04:00
|
|
|
fcc->dispatch_list = NULL;
|
2014-04-02 10:34:36 +04:00
|
|
|
}
|
|
|
|
|
2014-04-27 10:21:21 +04:00
|
|
|
wait_event_interruptible(*q,
|
2014-09-05 14:31:00 +04:00
|
|
|
kthread_should_stop() || !llist_empty(&fcc->issue_list));
|
2014-04-02 10:34:36 +04:00
|
|
|
goto repeat;
|
|
|
|
}
|
|
|
|
|
|
|
|
int f2fs_issue_flush(struct f2fs_sb_info *sbi)
|
|
|
|
{
|
2014-04-27 10:21:21 +04:00
|
|
|
struct flush_cmd_control *fcc = SM_I(sbi)->cmd_control_info;
|
2014-05-08 13:00:35 +04:00
|
|
|
struct flush_cmd cmd;
|
2014-04-02 10:34:36 +04:00
|
|
|
|
2014-07-26 04:46:10 +04:00
|
|
|
trace_f2fs_issue_flush(sbi->sb, test_opt(sbi, NOBARRIER),
|
|
|
|
test_opt(sbi, FLUSH_MERGE));
|
|
|
|
|
2014-07-23 20:57:31 +04:00
|
|
|
if (test_opt(sbi, NOBARRIER))
|
|
|
|
return 0;
|
|
|
|
|
2016-05-23 22:04:56 +03:00
|
|
|
if (!test_opt(sbi, FLUSH_MERGE) || !atomic_read(&fcc->submit_flush)) {
|
2015-08-14 21:43:56 +03:00
|
|
|
int ret;
|
|
|
|
|
2016-05-23 22:04:56 +03:00
|
|
|
atomic_inc(&fcc->submit_flush);
|
2016-10-07 05:02:05 +03:00
|
|
|
ret = submit_flush_wait(sbi);
|
2016-05-23 22:04:56 +03:00
|
|
|
atomic_dec(&fcc->submit_flush);
|
2015-08-14 21:43:56 +03:00
|
|
|
return ret;
|
|
|
|
}
|
2014-04-02 10:34:36 +04:00
|
|
|
|
2014-05-08 13:00:35 +04:00
|
|
|
init_completion(&cmd.wait);
|
2014-04-02 10:34:36 +04:00
|
|
|
|
2016-05-23 22:04:56 +03:00
|
|
|
atomic_inc(&fcc->submit_flush);
|
2014-09-05 14:31:00 +04:00
|
|
|
llist_add(&cmd.llnode, &fcc->issue_list);
|
2014-04-02 10:34:36 +04:00
|
|
|
|
2014-04-27 10:21:21 +04:00
|
|
|
if (!fcc->dispatch_list)
|
|
|
|
wake_up(&fcc->flush_wait_queue);
|
2014-04-02 10:34:36 +04:00
|
|
|
|
2016-12-08 03:23:32 +03:00
|
|
|
if (fcc->f2fs_issue_flush) {
|
|
|
|
wait_for_completion(&cmd.wait);
|
|
|
|
atomic_dec(&fcc->submit_flush);
|
|
|
|
} else {
|
|
|
|
llist_del_all(&fcc->issue_list);
|
|
|
|
atomic_set(&fcc->submit_flush, 0);
|
|
|
|
}
|
2014-05-08 13:00:35 +04:00
|
|
|
|
|
|
|
return cmd.ret;
|
2014-04-02 10:34:36 +04:00
|
|
|
}
|
|
|
|
|
2014-04-27 10:21:33 +04:00
|
|
|
int create_flush_cmd_control(struct f2fs_sb_info *sbi)
|
|
|
|
{
|
|
|
|
dev_t dev = sbi->sb->s_bdev->bd_dev;
|
|
|
|
struct flush_cmd_control *fcc;
|
|
|
|
int err = 0;
|
|
|
|
|
2016-12-08 03:23:32 +03:00
|
|
|
if (SM_I(sbi)->cmd_control_info) {
|
|
|
|
fcc = SM_I(sbi)->cmd_control_info;
|
|
|
|
goto init_thread;
|
|
|
|
}
|
|
|
|
|
2014-04-27 10:21:33 +04:00
|
|
|
fcc = kzalloc(sizeof(struct flush_cmd_control), GFP_KERNEL);
|
|
|
|
if (!fcc)
|
|
|
|
return -ENOMEM;
|
2016-05-23 22:04:56 +03:00
|
|
|
atomic_set(&fcc->submit_flush, 0);
|
2014-04-27 10:21:33 +04:00
|
|
|
init_waitqueue_head(&fcc->flush_wait_queue);
|
2014-09-05 14:31:00 +04:00
|
|
|
init_llist_head(&fcc->issue_list);
|
2014-07-07 07:21:59 +04:00
|
|
|
SM_I(sbi)->cmd_control_info = fcc;
|
2016-12-08 03:23:32 +03:00
|
|
|
init_thread:
|
2014-04-27 10:21:33 +04:00
|
|
|
fcc->f2fs_issue_flush = kthread_run(issue_flush_thread, sbi,
|
|
|
|
"f2fs_flush-%u:%u", MAJOR(dev), MINOR(dev));
|
|
|
|
if (IS_ERR(fcc->f2fs_issue_flush)) {
|
|
|
|
err = PTR_ERR(fcc->f2fs_issue_flush);
|
|
|
|
kfree(fcc);
|
2014-07-07 07:21:59 +04:00
|
|
|
SM_I(sbi)->cmd_control_info = NULL;
|
2014-04-27 10:21:33 +04:00
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
2016-12-08 03:23:32 +03:00
|
|
|
void destroy_flush_cmd_control(struct f2fs_sb_info *sbi, bool free)
|
2014-04-27 10:21:33 +04:00
|
|
|
{
|
2014-07-07 07:21:59 +04:00
|
|
|
struct flush_cmd_control *fcc = SM_I(sbi)->cmd_control_info;
|
2014-04-27 10:21:33 +04:00
|
|
|
|
2016-12-08 03:23:32 +03:00
|
|
|
if (fcc && fcc->f2fs_issue_flush) {
|
|
|
|
struct task_struct *flush_thread = fcc->f2fs_issue_flush;
|
|
|
|
|
|
|
|
fcc->f2fs_issue_flush = NULL;
|
|
|
|
kthread_stop(flush_thread);
|
|
|
|
}
|
|
|
|
if (free) {
|
|
|
|
kfree(fcc);
|
|
|
|
SM_I(sbi)->cmd_control_info = NULL;
|
|
|
|
}
|
2014-04-27 10:21:33 +04:00
|
|
|
}
|
|
|
|
|
2012-11-02 12:09:16 +04:00
|
|
|
static void __locate_dirty_segment(struct f2fs_sb_info *sbi, unsigned int segno,
|
|
|
|
enum dirty_type dirty_type)
|
|
|
|
{
|
|
|
|
struct dirty_seglist_info *dirty_i = DIRTY_I(sbi);
|
|
|
|
|
|
|
|
/* need not be added */
|
|
|
|
if (IS_CURSEG(sbi, segno))
|
|
|
|
return;
|
|
|
|
|
|
|
|
if (!test_and_set_bit(segno, dirty_i->dirty_segmap[dirty_type]))
|
|
|
|
dirty_i->nr_dirty[dirty_type]++;
|
|
|
|
|
|
|
|
if (dirty_type == DIRTY) {
|
|
|
|
struct seg_entry *sentry = get_seg_entry(sbi, segno);
|
2013-10-25 12:31:57 +04:00
|
|
|
enum dirty_type t = sentry->type;
|
f2fs: fix the bitmap consistency of dirty segments
As shown below, there are 8 segment bitmaps for SSR victim candidates.
enum dirty_type {
DIRTY_HOT_DATA, /* dirty segments assigned as hot data logs */
DIRTY_WARM_DATA, /* dirty segments assigned as warm data logs */
DIRTY_COLD_DATA, /* dirty segments assigned as cold data logs */
DIRTY_HOT_NODE, /* dirty segments assigned as hot node logs */
DIRTY_WARM_NODE, /* dirty segments assigned as warm node logs */
DIRTY_COLD_NODE, /* dirty segments assigned as cold node logs */
DIRTY, /* to count # of dirty segments */
PRE, /* to count # of entirely obsolete segments */
NR_DIRTY_TYPE
};
The first 6 bitmaps indicate the segments dirtied by each active log area.
The DIRTY bitmap is the union of all 6 of these bitmaps.
For example,
o DIRTY_HOT_DATA : 1010000
o DIRTY_WARM_DATA: 0100000
o DIRTY_COLD_DATA: 0001000
o DIRTY_HOT_NODE : 0000010
o DIRTY_WARM_NODE: 0000001
o DIRTY_COLD_NODE: 0000000
In this case,
o DIRTY : 1111011,
which means that we should guarantee consistency between DIRTY and the other
bitmaps concretely.
However, the SSR mode selects victims freely from any log types, which can set
multiple bits across the various bitmap types.
So, this patch eliminates this inconsistency.
Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
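The invariant described in the commit message above can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel implementation: `struct dirty_maps`, `seg_bit`, `set_one`, `clear_one`, `mark_dirty`, `clear_dirty` and `is_consistent` are hypothetical names, each bitmap is a single word, and the log type is passed in by the caller instead of being read from the segment entry.

```c
#include <assert.h>

/* Same ordering as the enum quoted in the commit message. */
enum dirty_type {
	DIRTY_HOT_DATA,
	DIRTY_WARM_DATA,
	DIRTY_COLD_DATA,
	DIRTY_HOT_NODE,
	DIRTY_WARM_NODE,
	DIRTY_COLD_NODE,
	DIRTY,
	PRE,
	NR_DIRTY_TYPE
};

#define NR_SEGS 7	/* assumption: seven segments, as in the example */

/* Hypothetical stand-in: one word per bitmap, one bit per segment. */
struct dirty_maps {
	unsigned int map[NR_DIRTY_TYPE];
	int nr_dirty[NR_DIRTY_TYPE];
};

/* The leftmost digit of the printed bitmaps is treated as segment 0. */
static unsigned int seg_bit(unsigned int segno)
{
	return 1u << (NR_SEGS - 1 - segno);
}

/* Set a bit and bump the counter only if the bit was clear. */
static void set_one(struct dirty_maps *d, enum dirty_type t, unsigned int bit)
{
	if (!(d->map[t] & bit)) {
		d->map[t] |= bit;
		d->nr_dirty[t]++;
	}
}

/* Clear a bit and drop the counter only if the bit was set. */
static void clear_one(struct dirty_maps *d, enum dirty_type t, unsigned int bit)
{
	if (d->map[t] & bit) {
		d->map[t] &= ~bit;
		d->nr_dirty[t]--;
	}
}

/* Marking a segment dirty touches DIRTY and exactly one per-log bitmap. */
static void mark_dirty(struct dirty_maps *d, unsigned int segno,
		       enum dirty_type log_type)
{
	set_one(d, DIRTY, seg_bit(segno));
	set_one(d, log_type, seg_bit(segno));
}

/* Clearing undoes both bits together, so the maps cannot drift apart. */
static void clear_dirty(struct dirty_maps *d, unsigned int segno,
			enum dirty_type log_type)
{
	clear_one(d, DIRTY, seg_bit(segno));
	clear_one(d, log_type, seg_bit(segno));
}

/* DIRTY must equal the union of the six per-log bitmaps. */
static int is_consistent(const struct dirty_maps *d)
{
	unsigned int merged = 0;
	int t;

	for (t = DIRTY_HOT_DATA; t <= DIRTY_COLD_NODE; t++)
		merged |= d->map[t];
	return merged == d->map[DIRTY];
}
```

Calling `mark_dirty` for segments 0 and 2 as hot data, 1 as warm data, 3 as cold data, 5 as hot node and 6 as warm node reproduces the bitmaps listed in the message. Updating both bitmaps in one helper is the design point: no caller can set a per-log bit without the matching DIRTY bit.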
2013-04-01 08:52:09 +04:00
|
|
|
|
2014-09-03 03:24:11 +04:00
|
|
|
if (unlikely(t >= DIRTY)) {
|
|
|
|
f2fs_bug_on(sbi, 1);
|
|
|
|
return;
|
|
|
|
}
|
2013-10-25 12:31:57 +04:00
|
|
|
if (!test_and_set_bit(segno, dirty_i->dirty_segmap[t]))
|
|
|
|
dirty_i->nr_dirty[t]++;
|
2012-11-02 12:09:16 +04:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
static void __remove_dirty_segment(struct f2fs_sb_info *sbi, unsigned int segno,
|
|
|
|
enum dirty_type dirty_type)
|
|
|
|
{
|
|
|
|
struct dirty_seglist_info *dirty_i = DIRTY_I(sbi);
|
|
|
|
|
|
|
|
if (test_and_clear_bit(segno, dirty_i->dirty_segmap[dirty_type]))
|
|
|
|
dirty_i->nr_dirty[dirty_type]--;
|
|
|
|
|
|
|
|
if (dirty_type == DIRTY) {
|
2013-10-25 12:31:57 +04:00
|
|
|
struct seg_entry *sentry = get_seg_entry(sbi, segno);
|
|
|
|
enum dirty_type t = sentry->type;
|
|
|
|
|
|
|
|
if (test_and_clear_bit(segno, dirty_i->dirty_segmap[t]))
|
|
|
|
dirty_i->nr_dirty[t]--;
|
2013-04-01 08:52:09 +04:00
|
|
|
|
2013-03-31 08:26:03 +04:00
|
|
|
if (get_valid_blocks(sbi, segno, sbi->segs_per_sec) == 0)
|
|
|
|
clear_bit(GET_SECNO(sbi, segno),
|
|
|
|
dirty_i->victim_secmap);
|
2012-11-02 12:09:16 +04:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2012-11-29 08:28:09 +04:00
|
|
|
/*
|
2012-11-02 12:09:16 +04:00
|
|
|
* This must not fail with an error such as -ENOMEM.
|
|
|
|
* Adding a dirty entry into the seglist is not a critical operation.
|
|
|
|
* If a given segment is one of the current working segments, it won't be added.
|
|
|
|
*/
|
2013-06-13 12:59:28 +04:00
|
|
|
static void locate_dirty_segment(struct f2fs_sb_info *sbi, unsigned int segno)
|
2012-11-02 12:09:16 +04:00
|
|
|
{
|
|
|
|
struct dirty_seglist_info *dirty_i = DIRTY_I(sbi);
|
|
|
|
unsigned short valid_blocks;
|
|
|
|
|
|
|
|
if (segno == NULL_SEGNO || IS_CURSEG(sbi, segno))
|
|
|
|
return;
|
|
|
|
|
|
|
|
mutex_lock(&dirty_i->seglist_lock);
|
|
|
|
|
|
|
|
valid_blocks = get_valid_blocks(sbi, segno, 0);
|
|
|
|
|
|
|
|
if (valid_blocks == 0) {
|
|
|
|
__locate_dirty_segment(sbi, segno, PRE);
|
|
|
|
__remove_dirty_segment(sbi, segno, DIRTY);
|
|
|
|
} else if (valid_blocks < sbi->blocks_per_seg) {
|
|
|
|
__locate_dirty_segment(sbi, segno, DIRTY);
|
|
|
|
} else {
|
|
|
|
/* Recovery routine with SSR needs this */
|
|
|
|
__remove_dirty_segment(sbi, segno, DIRTY);
|
|
|
|
}
|
|
|
|
|
|
|
|
mutex_unlock(&dirty_i->seglist_lock);
|
|
|
|
}
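The three-way decision that locate_dirty_segment makes from the valid block count can be sketched in isolation. This is a simplified illustration, not kernel code: `enum seg_state` and `classify_segment` are hypothetical names, and the locking, the current-segment check and the bitmap updates are left out.

```c
#include <assert.h>

/* Hypothetical labels for the three branches of locate_dirty_segment. */
enum seg_state {
	SEG_PREFREE,	/* no valid blocks: move from DIRTY to PRE */
	SEG_DIRTY,	/* partially valid: record in DIRTY */
	SEG_FULL	/* fully valid: drop from DIRTY */
};

/* Classify a segment purely by how many of its blocks are still valid. */
static enum seg_state classify_segment(unsigned int valid_blocks,
				       unsigned int blocks_per_seg)
{
	if (valid_blocks == 0)
		return SEG_PREFREE;
	if (valid_blocks < blocks_per_seg)
		return SEG_DIRTY;
	return SEG_FULL;
}
```

With 512 blocks per segment (an assumed value), a count of 0 maps to `SEG_PREFREE`, any count from 1 to 511 maps to `SEG_DIRTY`, and 512 maps to `SEG_FULL`.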
|
|
|
|
|
2016-08-29 18:58:34 +03:00
|
|
|
static struct bio_entry *__add_bio_entry(struct f2fs_sb_info *sbi,
|
2016-12-30 01:07:53 +03:00
|
|
|
struct bio *bio, block_t lstart, block_t len)
|
2016-08-29 18:58:34 +03:00
|
|
|
{
|
|
|
|
struct list_head *wait_list = &(SM_I(sbi)->wait_list);
|
|
|
|
struct bio_entry *be = f2fs_kmem_cache_alloc(bio_entry_slab, GFP_NOFS);
|
|
|
|
|
|
|
|
INIT_LIST_HEAD(&be->list);
|
|
|
|
be->bio = bio;
|
2016-12-30 01:07:53 +03:00
|
|
|
be->lstart = lstart;
|
|
|
|
be->len = len;
|
2016-08-29 18:58:34 +03:00
|
|
|
init_completion(&be->event);
|
|
|
|
list_add_tail(&be->list, wait_list);
|
|
|
|
|
|
|
|
return be;
|
|
|
|
}
|
|
|
|
|
2016-12-30 01:07:53 +03:00
|
|
|
/* This should be covered by global mutex, &sit_i->sentry_lock */
|
|
|
|
void f2fs_wait_discard_bio(struct f2fs_sb_info *sbi, block_t blkaddr)
|
2016-08-29 18:58:34 +03:00
|
|
|
{
|
|
|
|
struct list_head *wait_list = &(SM_I(sbi)->wait_list);
|
|
|
|
struct bio_entry *be, *tmp;
|
|
|
|
|
|
|
|
list_for_each_entry_safe(be, tmp, wait_list, list) {
|
|
|
|
struct bio *bio = be->bio;
|
|
|
|
int err;
|
|
|
|
|
2016-12-30 01:07:53 +03:00
|
|
|
if (!completion_done(&be->event)) {
|
|
|
|
if ((be->lstart <= blkaddr &&
|
|
|
|
blkaddr < be->lstart + be->len) ||
|
|
|
|
blkaddr == NULL_ADDR)
|
|
|
|
wait_for_completion_io(&be->event);
|
|
|
|
else
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
2016-08-29 18:58:34 +03:00
|
|
|
err = be->error;
|
|
|
|
if (err == -EOPNOTSUPP)
|
|
|
|
err = 0;
|
|
|
|
|
|
|
|
if (err)
|
|
|
|
f2fs_msg(sbi->sb, KERN_INFO,
|
|
|
|
"Issue discard failed, ret: %d", err);
|
|
|
|
|
|
|
|
bio_put(bio);
|
|
|
|
list_del(&be->list);
|
|
|
|
kmem_cache_free(bio_entry_slab, be);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
static void f2fs_submit_bio_wait_endio(struct bio *bio)
|
|
|
|
{
|
|
|
|
struct bio_entry *be = (struct bio_entry *)bio->bi_private;
|
|
|
|
|
|
|
|
be->error = bio->bi_error;
|
|
|
|
complete(&be->event);
|
|
|
|
}
|
|
|
|
|
|
|
|
/* this function is copied from blkdev_issue_discard from block/blk-lib.c */
|
2016-10-28 11:45:06 +03:00
|
|
|
static int __f2fs_issue_discard_async(struct f2fs_sb_info *sbi,
|
2016-10-07 05:02:05 +03:00
|
|
|
struct block_device *bdev, block_t blkstart, block_t blklen)
|
2016-08-29 18:58:34 +03:00
|
|
|
{
|
|
|
|
struct bio *bio = NULL;
|
2016-12-30 01:07:53 +03:00
|
|
|
block_t lblkstart = blkstart;
|
2016-08-29 18:58:34 +03:00
|
|
|
int err;
|
|
|
|
|
2016-10-28 11:45:06 +03:00
|
|
|
trace_f2fs_issue_discard(sbi->sb, blkstart, blklen);
|
|
|
|
|
2016-10-07 05:02:05 +03:00
|
|
|
if (sbi->s_ndevs) {
|
|
|
|
int devi = f2fs_target_device_index(sbi, blkstart);
|
|
|
|
|
|
|
|
blkstart -= FDEV(devi).start_blk;
|
|
|
|
}
|
2016-10-28 11:45:06 +03:00
|
|
|
err = __blkdev_issue_discard(bdev,
|
|
|
|
SECTOR_FROM_BLOCK(blkstart),
|
|
|
|
SECTOR_FROM_BLOCK(blklen),
|
|
|
|
GFP_NOFS, 0, &bio);
|
2016-08-29 18:58:34 +03:00
|
|
|
if (!err && bio) {
|
2016-12-30 01:07:53 +03:00
|
|
|
struct bio_entry *be = __add_bio_entry(sbi, bio,
|
|
|
|
lblkstart, blklen);
|
2016-08-29 18:58:34 +03:00
|
|
|
|
|
|
|
bio->bi_private = be;
|
|
|
|
bio->bi_end_io = f2fs_submit_bio_wait_endio;
|
|
|
|
bio->bi_opf |= REQ_SYNC;
|
|
|
|
submit_bio(bio);
|
|
|
|
}
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
2016-10-28 11:45:06 +03:00
|
|
|
#ifdef CONFIG_BLK_DEV_ZONED
|
2016-10-07 05:02:05 +03:00
|
|
|
static int __f2fs_issue_discard_zone(struct f2fs_sb_info *sbi,
|
|
|
|
struct block_device *bdev, block_t blkstart, block_t blklen)
|
2016-10-28 11:45:06 +03:00
|
|
|
{
|
|
|
|
sector_t nr_sects = SECTOR_FROM_BLOCK(blklen);
|
2016-10-07 05:02:05 +03:00
|
|
|
sector_t sector;
|
|
|
|
int devi = 0;
|
|
|
|
|
|
|
|
if (sbi->s_ndevs) {
|
|
|
|
devi = f2fs_target_device_index(sbi, blkstart);
|
|
|
|
blkstart -= FDEV(devi).start_blk;
|
|
|
|
}
|
|
|
|
sector = SECTOR_FROM_BLOCK(blkstart);
|
2016-10-28 11:45:06 +03:00
|
|
|
|
2017-01-12 17:58:32 +03:00
|
|
|
if (sector & (bdev_zone_sectors(bdev) - 1) ||
|
|
|
|
nr_sects != bdev_zone_sectors(bdev)) {
|
2016-10-28 11:45:06 +03:00
|
|
|
f2fs_msg(sbi->sb, KERN_INFO,
|
2016-10-07 05:02:05 +03:00
|
|
|
"(%d) %s: Unaligned discard attempted (block %x + %x)",
|
|
|
|
devi, sbi->s_ndevs ? FDEV(devi).path: "",
|
|
|
|
blkstart, blklen);
|
2016-10-28 11:45:06 +03:00
|
|
|
return -EIO;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We need to know the type of the zone: for conventional zones,
|
|
|
|
* use regular discard if the drive supports it. For sequential
|
|
|
|
* zones, reset the zone write pointer.
|
|
|
|
*/
|
2016-10-07 05:02:05 +03:00
|
|
|
switch (get_blkz_type(sbi, bdev, blkstart)) {
|
2016-10-28 11:45:06 +03:00
|
|
|
|
|
|
|
case BLK_ZONE_TYPE_CONVENTIONAL:
|
|
|
|
if (!blk_queue_discard(bdev_get_queue(bdev)))
|
|
|
|
return 0;
|
2016-10-07 05:02:05 +03:00
|
|
|
return __f2fs_issue_discard_async(sbi, bdev, blkstart, blklen);
|
2016-10-28 11:45:06 +03:00
|
|
|
case BLK_ZONE_TYPE_SEQWRITE_REQ:
|
|
|
|
case BLK_ZONE_TYPE_SEQWRITE_PREF:
|
2016-10-28 11:45:07 +03:00
|
|
|
trace_f2fs_issue_reset_zone(sbi->sb, blkstart);
|
2016-10-28 11:45:06 +03:00
|
|
|
return blkdev_reset_zones(bdev, sector,
|
|
|
|
nr_sects, GFP_NOFS);
|
|
|
|
default:
|
|
|
|
/* Unknown zone type: broken device? */
|
|
|
|
return -EIO;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
#endif
|
|
|
|
|
2016-10-07 05:02:05 +03:00
|
|
|
static int __issue_discard_async(struct f2fs_sb_info *sbi,
|
|
|
|
struct block_device *bdev, block_t blkstart, block_t blklen)
|
|
|
|
{
|
|
|
|
#ifdef CONFIG_BLK_DEV_ZONED
|
|
|
|
if (f2fs_sb_mounted_blkzoned(sbi->sb) &&
|
|
|
|
bdev_zoned_model(bdev) != BLK_ZONED_NONE)
|
|
|
|
return __f2fs_issue_discard_zone(sbi, bdev, blkstart, blklen);
|
|
|
|
#endif
|
|
|
|
return __f2fs_issue_discard_async(sbi, bdev, blkstart, blklen);
|
|
|
|
}
|
|
|
|
|
2014-04-15 08:57:55 +04:00
|
|
|
static int f2fs_issue_discard(struct f2fs_sb_info *sbi,
|
2013-11-12 11:55:17 +04:00
|
|
|
block_t blkstart, block_t blklen)
|
|
|
|
{
|
2016-10-07 05:02:05 +03:00
|
|
|
sector_t start = blkstart, len = 0;
|
|
|
|
struct block_device *bdev;
|
2015-05-01 08:37:50 +03:00
|
|
|
struct seg_entry *se;
|
|
|
|
unsigned int offset;
|
|
|
|
block_t i;
|
2016-10-07 05:02:05 +03:00
|
|
|
int err = 0;
|
|
|
|
|
|
|
|
bdev = f2fs_target_device(sbi, blkstart, NULL);
|
|
|
|
|
|
|
|
for (i = blkstart; i < blkstart + blklen; i++, len++) {
|
|
|
|
if (i != start) {
|
|
|
|
struct block_device *bdev2 =
|
|
|
|
f2fs_target_device(sbi, i, NULL);
|
|
|
|
|
|
|
|
if (bdev2 != bdev) {
|
|
|
|
err = __issue_discard_async(sbi, bdev,
|
|
|
|
start, len);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
bdev = bdev2;
|
|
|
|
start = i;
|
|
|
|
len = 0;
|
|
|
|
}
|
|
|
|
}
|
2015-05-01 08:37:50 +03:00
|
|
|
|
|
|
|
se = get_seg_entry(sbi, GET_SEGNO(sbi, i));
|
|
|
|
offset = GET_BLKOFF_FROM_SEG0(sbi, i);
|
|
|
|
|
|
|
|
if (!f2fs_test_and_set_bit(offset, se->discard_map))
|
|
|
|
sbi->discard_blks--;
|
|
|
|
}
|
2016-10-28 11:45:06 +03:00
|
|
|
|
2016-10-07 05:02:05 +03:00
|
|
|
if (len)
|
|
|
|
err = __issue_discard_async(sbi, bdev, start, len);
|
|
|
|
return err;
|
2014-04-15 08:57:55 +04:00
|
|
|
}
|
|
|
|
|
2014-10-29 08:27:59 +03:00
|
|
|
static void __add_discard_entry(struct f2fs_sb_info *sbi,
|
2015-05-01 08:37:50 +03:00
|
|
|
struct cp_control *cpc, struct seg_entry *se,
|
|
|
|
unsigned int start, unsigned int end)
|
2013-11-12 09:49:56 +04:00
|
|
|
{
|
|
|
|
struct list_head *head = &SM_I(sbi)->discard_list;
|
2014-10-29 08:27:59 +03:00
|
|
|
struct discard_entry *new, *last;
|
|
|
|
|
|
|
|
if (!list_empty(head)) {
|
|
|
|
last = list_last_entry(head, struct discard_entry, list);
|
|
|
|
if (START_BLOCK(sbi, cpc->trim_start) + start ==
|
|
|
|
last->blkaddr + last->len) {
|
|
|
|
last->len += end - start;
|
|
|
|
goto done;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
new = f2fs_kmem_cache_alloc(discard_entry_slab, GFP_NOFS);
|
|
|
|
INIT_LIST_HEAD(&new->list);
|
|
|
|
new->blkaddr = START_BLOCK(sbi, cpc->trim_start) + start;
|
|
|
|
new->len = end - start;
|
|
|
|
list_add_tail(&new->list, head);
|
|
|
|
done:
|
|
|
|
SM_I(sbi)->nr_discards += end - start;
|
|
|
|
}
|
|
|
|
|
2016-12-30 09:06:15 +03:00
|
|
|
static bool add_discard_addrs(struct f2fs_sb_info *sbi, struct cp_control *cpc,
|
|
|
|
bool check_only)
|
2014-10-29 08:27:59 +03:00
|
|
|
{
|
2013-11-12 09:49:56 +04:00
|
|
|
int entries = SIT_VBLOCK_MAP_SIZE / sizeof(unsigned long);
|
|
|
|
int max_blocks = sbi->blocks_per_seg;
|
2014-09-21 09:06:39 +04:00
|
|
|
struct seg_entry *se = get_seg_entry(sbi, cpc->trim_start);
|
2013-11-12 09:49:56 +04:00
|
|
|
unsigned long *cur_map = (unsigned long *)se->cur_valid_map;
|
|
|
|
unsigned long *ckpt_map = (unsigned long *)se->ckpt_valid_map;
|
2015-05-01 08:37:50 +03:00
|
|
|
unsigned long *discard_map = (unsigned long *)se->discard_map;
|
2015-02-11 03:44:29 +03:00
|
|
|
unsigned long *dmap = SIT_I(sbi)->tmp_map;
|
2013-11-12 09:49:56 +04:00
|
|
|
unsigned int start = 0, end = -1;
|
2014-09-21 09:06:39 +04:00
|
|
|
bool force = (cpc->reason == CP_DISCARD);
|
2013-11-12 09:49:56 +04:00
|
|
|
int i;
|
|
|
|
|
2016-08-02 20:56:40 +03:00
|
|
|
if (se->valid_blocks == max_blocks || !f2fs_discard_en(sbi))
|
2016-12-30 09:06:15 +03:00
|
|
|
return false;
|
2013-11-12 09:49:56 +04:00
|
|
|
|
2015-05-01 08:37:50 +03:00
|
|
|
if (!force) {
|
|
|
|
if (!test_opt(sbi, DISCARD) || !se->valid_blocks ||
|
2015-05-14 11:52:28 +03:00
|
|
|
SM_I(sbi)->nr_discards >= SM_I(sbi)->max_discards)
|
2016-12-30 09:06:15 +03:00
|
|
|
return false;
|
2014-09-21 09:06:39 +04:00
|
|
|
}
|
|
|
|
|
2013-11-12 09:49:56 +04:00
|
|
|
/* SIT_VBLOCK_MAP_SIZE should be a multiple of sizeof(unsigned long) */
|
|
|
|
for (i = 0; i < entries; i++)
|
2015-05-01 08:37:50 +03:00
|
|
|
dmap[i] = force ? ~ckpt_map[i] & ~discard_map[i] :
|
2014-12-13 00:53:41 +03:00
|
|
|
(cur_map[i] ^ ckpt_map[i]) & ckpt_map[i];
|
2013-11-12 09:49:56 +04:00
|
|
|
|
2014-09-21 09:06:39 +04:00
|
|
|
while (force || SM_I(sbi)->nr_discards <= SM_I(sbi)->max_discards) {
|
2013-11-12 09:49:56 +04:00
|
|
|
start = __find_rev_next_bit(dmap, max_blocks, end + 1);
|
|
|
|
if (start >= max_blocks)
|
|
|
|
break;
|
|
|
|
|
|
|
|
end = __find_rev_next_zero_bit(dmap, max_blocks, start + 1);
|
2016-07-07 07:13:33 +03:00
|
|
|
if (force && start && end != max_blocks
|
|
|
|
&& (end - start) < cpc->trim_minlen)
|
|
|
|
continue;
|
|
|
|
|
2016-12-30 09:06:15 +03:00
|
|
|
if (check_only)
|
|
|
|
return true;
|
|
|
|
|
2015-05-01 08:37:50 +03:00
|
|
|
__add_discard_entry(sbi, cpc, se, start, end);
|
2013-11-12 09:49:56 +04:00
|
|
|
}
|
2016-12-30 09:06:15 +03:00
|
|
|
return false;
|
2013-11-12 09:49:56 +04:00
|
|
|
}
|
|
|
|
|
2014-09-21 09:06:39 +04:00
|
|
|
void release_discard_addrs(struct f2fs_sb_info *sbi)
|
|
|
|
{
|
|
|
|
struct list_head *head = &(SM_I(sbi)->discard_list);
|
|
|
|
struct discard_entry *entry, *this;
|
|
|
|
|
|
|
|
/* drop caches */
|
|
|
|
list_for_each_entry_safe(entry, this, head, list) {
|
|
|
|
list_del(&entry->list);
|
|
|
|
kmem_cache_free(discard_entry_slab, entry);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2012-11-29 08:28:09 +04:00
|
|
|
/*
|
2012-11-02 12:09:16 +04:00
|
|
|
* Should call clear_prefree_segments after checkpoint is done.
|
|
|
|
*/
|
|
|
|
static void set_prefree_as_free_segments(struct f2fs_sb_info *sbi)
|
|
|
|
{
|
|
|
|
struct dirty_seglist_info *dirty_i = DIRTY_I(sbi);
|
2014-08-04 06:10:07 +04:00
|
|
|
unsigned int segno;
|
2012-11-02 12:09:16 +04:00
|
|
|
|
|
|
|
mutex_lock(&dirty_i->seglist_lock);
|
2014-09-23 22:23:01 +04:00
|
|
|
for_each_set_bit(segno, dirty_i->dirty_segmap[PRE], MAIN_SEGS(sbi))
|
2012-11-02 12:09:16 +04:00
|
|
|
__set_test_and_free(sbi, segno);
|
|
|
|
mutex_unlock(&dirty_i->seglist_lock);
|
|
|
|
}
|
|
|
|
|
2015-05-01 08:50:06 +03:00
|
|
|
void clear_prefree_segments(struct f2fs_sb_info *sbi, struct cp_control *cpc)
|
2012-11-02 12:09:16 +04:00
|
|
|
{
|
2013-11-12 09:49:56 +04:00
|
|
|
struct list_head *head = &(SM_I(sbi)->discard_list);
|
2014-03-29 07:33:17 +04:00
|
|
|
struct discard_entry *entry, *this;
|
2012-11-02 12:09:16 +04:00
|
|
|
struct dirty_seglist_info *dirty_i = DIRTY_I(sbi);
|
2016-08-29 18:58:34 +03:00
|
|
|
struct blk_plug plug;
|
2013-11-11 04:24:37 +04:00
|
|
|
unsigned long *prefree_map = dirty_i->dirty_segmap[PRE];
|
|
|
|
unsigned int start = 0, end = -1;
|
2016-06-04 05:29:38 +03:00
|
|
|
unsigned int secno, start_segno;
|
f2fs: fix to avoid redundant discard during fstrim
With below test steps, f2fs will issue redundant discard when doing fstrim,
the reason is that we issue discards for both prefree segments and
consecutive freed region user wants to trim, part regions they covered are
overlapped, here, we change to do not to issue any discards for prefree
segments in trimmed range.
1. mount -t f2fs -o discard /dev/zram0 /mnt/f2fs
2. fstrim -o 0 -l 3221225472 -m 2097152 -v /mnt/f2fs/
3. dd if=/dev/zero of=/mnt/f2fs/a bs=2M count=1
4. dd if=/dev/zero of=/mnt/f2fs/b bs=1M count=1
5. sync
6. rm /mnt/f2fs/a /mnt/f2fs/b
7. fstrim -o 0 -l 3221225472 -m 2097152 -v /mnt/f2fs/
Before:
<...>-5428 [001] ...1 9511.052125: f2fs_issue_discard: dev = (251,0), blkstart = 0x2200, blklen = 0x200
<...>-5428 [001] ...1 9511.052787: f2fs_issue_discard: dev = (251,0), blkstart = 0x2200, blklen = 0x300
After:
<...>-6764 [000] ...1 9720.382504: f2fs_issue_discard: dev = (251,0), blkstart = 0x2200, blklen = 0x300
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-07 17:46:55 +03:00
|
|
|
bool force = (cpc->reason == CP_DISCARD);
|
2012-11-02 12:09:16 +04:00
|
|
|
|
2016-08-29 18:58:34 +03:00
|
|
|
blk_start_plug(&plug);
|
|
|
|
|
2012-11-02 12:09:16 +04:00
|
|
|
mutex_lock(&dirty_i->seglist_lock);
|
2013-11-11 04:24:37 +04:00
|
|
|
|
2012-11-02 12:09:16 +04:00
|
|
|
while (1) {
|
2013-11-11 04:24:37 +04:00
|
|
|
int i;
|
2014-09-23 22:23:01 +04:00
|
|
|
start = find_next_bit(prefree_map, MAIN_SEGS(sbi), end + 1);
|
|
|
|
if (start >= MAIN_SEGS(sbi))
|
2012-11-02 12:09:16 +04:00
|
|
|
break;
|
2014-09-23 22:23:01 +04:00
|
|
|
end = find_next_zero_bit(prefree_map, MAIN_SEGS(sbi),
|
|
|
|
start + 1);
|
2013-11-11 04:24:37 +04:00
|
|
|
|
|
|
|
for (i = start; i < end; i++)
|
|
|
|
clear_bit(i, prefree_map);
|
|
|
|
|
|
|
|
dirty_i->nr_dirty[PRE] -= end - start;
|
|
|
|
|
2016-12-22 06:46:24 +03:00
|
|
|
if (!test_opt(sbi, DISCARD))
|
2013-11-11 04:24:37 +04:00
|
|
|
continue;
|
2012-11-02 12:09:16 +04:00
|
|
|
|
2016-12-22 06:46:24 +03:00
|
|
|
if (force && start >= cpc->trim_start &&
|
|
|
|
(end - 1) <= cpc->trim_end)
|
|
|
|
continue;
|
|
|
|
|
2016-06-04 05:29:38 +03:00
|
|
|
if (!test_opt(sbi, LFS) || sbi->segs_per_sec == 1) {
|
|
|
|
f2fs_issue_discard(sbi, START_BLOCK(sbi, start),
|
2013-11-12 11:55:17 +04:00
|
|
|
(end - start) << sbi->log_blocks_per_seg);
|
2016-06-04 05:29:38 +03:00
|
|
|
continue;
|
|
|
|
}
|
|
|
|
next:
|
|
|
|
secno = GET_SECNO(sbi, start);
|
|
|
|
start_segno = secno * sbi->segs_per_sec;
|
|
|
|
if (!IS_CURSEC(sbi, secno) &&
|
|
|
|
!get_valid_blocks(sbi, start, sbi->segs_per_sec))
|
|
|
|
f2fs_issue_discard(sbi, START_BLOCK(sbi, start_segno),
|
|
|
|
sbi->segs_per_sec << sbi->log_blocks_per_seg);
|
|
|
|
|
|
|
|
start = start_segno + sbi->segs_per_sec;
|
|
|
|
if (start < end)
|
|
|
|
goto next;
|
2012-11-02 12:09:16 +04:00
|
|
|
}
|
|
|
|
mutex_unlock(&dirty_i->seglist_lock);
|
2013-11-12 09:49:56 +04:00
|
|
|
|
|
|
|
/* send small discards */
|
2014-03-29 07:33:17 +04:00
|
|
|
list_for_each_entry_safe(entry, this, head, list) {
|
2016-07-07 17:46:55 +03:00
|
|
|
if (force && entry->len < cpc->trim_minlen)
|
2015-05-01 08:50:06 +03:00
|
|
|
goto skip;
|
2013-11-12 11:55:17 +04:00
|
|
|
f2fs_issue_discard(sbi, entry->blkaddr, entry->len);
|
2015-06-03 01:48:20 +03:00
|
|
|
cpc->trimmed += entry->len;
|
2015-05-01 08:50:06 +03:00
|
|
|
skip:
|
2013-11-12 09:49:56 +04:00
|
|
|
list_del(&entry->list);
|
|
|
|
SM_I(sbi)->nr_discards -= entry->len;
|
|
|
|
kmem_cache_free(discard_entry_slab, entry);
|
|
|
|
}
|
2016-08-29 18:58:34 +03:00
|
|
|
|
|
|
|
blk_finish_plug(&plug);
|
2012-11-02 12:09:16 +04:00
|
|
|
}
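The scan in clear_prefree_segments() above walks the prefree bitmap one run of consecutive set bits at a time: find the next set bit, find the next zero bit after it, clear everything in between, and treat [start, end) as one candidate discard range. Below is a minimal userspace sketch of that pattern, not the kernel code. The helpers bit_is_set(), bit_clear(), next_bit_with_value() and drain_runs(), and the struct seg_run type, are hypothetical stand-ins for find_next_bit(), find_next_zero_bit() and clear_bit(); locking, the discard mount option and section alignment are left out.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical userspace stand-ins for the kernel bitmap helpers. */
static int bit_is_set(const unsigned long *map, unsigned int bit)
{
    return (int)((map[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL);
}

static void bit_clear(unsigned long *map, unsigned int bit)
{
    map[bit / BITS_PER_WORD] &= ~(1UL << (bit % BITS_PER_WORD));
}

/* First index >= from whose bit equals want, or nbits if there is none. */
static unsigned int next_bit_with_value(const unsigned long *map,
                                        unsigned int nbits,
                                        unsigned int from, int want)
{
    unsigned int i;

    for (i = from; i < nbits; i++)
        if (bit_is_set(map, i) == want)
            return i;
    return nbits;
}

/* One maximal run of consecutive set bits: [start, start + len). */
struct seg_run {
    unsigned int start;
    unsigned int len;
};

/*
 * Clear every set bit in map and record each maximal run of set bits.
 * Returns the number of runs found; only the first max_runs are stored.
 */
static size_t drain_runs(unsigned long *map, unsigned int nbits,
                         struct seg_run *out, size_t max_runs)
{
    unsigned int start, end = 0, i;
    size_t n = 0;

    assert(nbits == 0 || map != NULL);

    while (1) {
        start = next_bit_with_value(map, nbits, end, 1);
        if (start >= nbits)
            break;
        end = next_bit_with_value(map, nbits, start + 1, 0);

        for (i = start; i < end; i++)
            bit_clear(map, i);

        if (n < max_runs) {
            out[n].start = start;
            out[n].len = end - start;
        }
        n++;
    }
    return n;
}
```

Each recorded run corresponds to one (start, end - start) pair that the kernel loop would shift by log_blocks_per_seg to obtain a block count before issuing a discard.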
|
|
|
|
|
f2fs: refactor flush_sit_entries codes for reducing SIT writes
In commit aec71382c681 ("f2fs: refactor flush_nat_entries codes for reducing NAT
writes"), we described the issue as below:
"Although building NAT journal in cursum reduce the read/write work for NAT
block, but previous design leave us lower performance when write checkpoint
frequently for these cases:
1. if journal in cursum has already full, it's a bit of waste that we flush all
nat entries to page for persistence, but not to cache any entries.
2. if journal in cursum is not full, we fill nat entries to journal until
journal is full, then flush the left dirty entries to disk without merge
journaled entries, so these journaled entries may be flushed to disk at next
checkpoint but lost chance to flushed last time."
Actually, we have the same problem in using SIT journal area.
In this patch, firstly we will update sit journal with dirty entries as many as
possible. Secondly if there is no space in sit journal, we will remove all
entries in journal and walk through the whole dirty entry bitmap of sit,
accounting dirty sit entries located in same SIT block to sit entry set. All
entry sets are linked to list sit_entry_set in sm_info, sorted ascending order
by count of entries in set. Later we flush entries in set which have fewest
entries into journal as many as we can, and then flush dense set with merged
entries to disk.
In this way we can use sit journal area more effectively, also we will reduce
SIT update, result in gaining in performance and saving lifetime of flash
device.
In my testing environment, it shows this patch can help to reduce SIT block
update obviously.
virtual machine + hard disk:
fsstress -p 20 -n 400 -l 5
sit page num cp count sit pages/cp
based 2006.50 1349.75 1.486
patched 1566.25 1463.25 1.070
Our latency of merging op is small when handling a great number of dirty SIT
entries in flush_sit_entries:
latency(ns) dirty sit count
36038 2151
49168 2123
37174 2232
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-04 14:13:01 +04:00
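The flush policy described in the commit message above (group dirty SIT entries by the SIT block they live in, sort the groups by ascending entry count, place the smallest groups in the journal while space remains, and write the rest as whole SIT blocks) can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel implementation: plan_flush(), struct entry_set, MAX_SETS and the journal_slots parameter are hypothetical names, and ENTRIES_PER_BLOCK is an assumed constant chosen for the sketch.

```c
#include <assert.h>
#include <stdlib.h>

#define ENTRIES_PER_BLOCK 55 /* assumed number of SIT entries per SIT block */
#define MAX_SETS 64          /* arbitrary cap for this sketch */

/* Dirty entries that share one SIT block. */
struct entry_set {
    unsigned int block; /* SIT block index */
    unsigned int count; /* dirty entries in that block */
};

/* Order sets by ascending entry count, then by block index. */
static int by_count(const void *a, const void *b)
{
    const struct entry_set *x = a, *y = b;

    if (x->count != y->count)
        return x->count < y->count ? -1 : 1;
    if (x->block != y->block)
        return x->block < y->block ? -1 : 1;
    return 0;
}

/*
 * Returns how many SIT blocks would have to be written when the
 * journal has journal_slots free entries.
 */
static unsigned int plan_flush(const unsigned int *dirty_segnos, size_t n,
                               unsigned int journal_slots)
{
    struct entry_set sets[MAX_SETS];
    size_t nsets = 0, i, j;
    unsigned int blocks_written = 0;

    /* Group dirty segment numbers by the SIT block that holds them. */
    for (i = 0; i < n; i++) {
        unsigned int block = dirty_segnos[i] / ENTRIES_PER_BLOCK;

        for (j = 0; j < nsets; j++)
            if (sets[j].block == block)
                break;
        if (j == nsets) {
            assert(nsets < MAX_SETS);
            sets[nsets].block = block;
            sets[nsets].count = 0;
            nsets++;
        }
        sets[j].count++;
    }

    /* Sparse sets first: they are the cheapest to absorb in the journal. */
    qsort(sets, nsets, sizeof(sets[0]), by_count);

    for (i = 0; i < nsets; i++) {
        if (sets[i].count <= journal_slots)
            journal_slots -= sets[i].count; /* absorbed by the journal */
        else
            blocks_written++;               /* dense set: one block write */
    }
    return blocks_written;
}
```

Because the sets are sorted by ascending count, once one set no longer fits in the journal no later set can fit either, so the greedy pass never skips a set that a different order would have absorbed.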
|
|
|
static bool __mark_sit_entry_dirty(struct f2fs_sb_info *sbi, unsigned int segno)
|
2012-11-02 12:09:16 +04:00
|
|
|
{
|
|
|
|
struct sit_info *sit_i = SIT_I(sbi);
|
2014-09-04 14:13:01 +04:00
|
|
|
|
|
|
|
if (!__test_and_set_bit(segno, sit_i->dirty_sentries_bitmap)) {
|
2012-11-02 12:09:16 +04:00
|
|
|
sit_i->dirty_sentries++;
|
2014-09-04 14:13:01 +04:00
|
|
|
return false;
|
|
|
|
}
|
|
|
|
|
|
|
|
return true;
|
2012-11-02 12:09:16 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
static void __set_sit_entry_type(struct f2fs_sb_info *sbi, int type,
|
|
|
|
unsigned int segno, int modified)
|
|
|
|
{
|
|
|
|
struct seg_entry *se = get_seg_entry(sbi, segno);
|
|
|
|
se->type = type;
|
|
|
|
if (modified)
|
|
|
|
__mark_sit_entry_dirty(sbi, segno);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void update_sit_entry(struct f2fs_sb_info *sbi, block_t blkaddr, int del)
|
|
|
|
{
|
|
|
|
struct seg_entry *se;
|
|
|
|
unsigned int segno, offset;
|
|
|
|
long int new_vblocks;
|
|
|
|
|
|
|
|
segno = GET_SEGNO(sbi, blkaddr);
|
|
|
|
|
|
|
|
se = get_seg_entry(sbi, segno);
|
|
|
|
new_vblocks = se->valid_blocks + del;
|
2014-02-04 08:01:10 +04:00
|
|
|
offset = GET_BLKOFF_FROM_SEG0(sbi, blkaddr);
|
2012-11-02 12:09:16 +04:00
|
|
|
|
2014-09-03 02:52:58 +04:00
|
|
|
f2fs_bug_on(sbi, (new_vblocks >> (sizeof(unsigned short) << 3) ||
|
2012-11-02 12:09:16 +04:00
|
|
|
(new_vblocks > sbi->blocks_per_seg)));
|
|
|
|
|
|
|
|
se->valid_blocks = new_vblocks;
|
|
|
|
se->mtime = get_mtime(sbi);
|
|
|
|
SIT_I(sbi)->max_mtime = se->mtime;
|
|
|
|
|
|
|
|
/* Update valid block bitmap */
|
|
|
|
if (del > 0) {
|
2017-01-07 13:51:01 +03:00
|
|
|
if (f2fs_test_and_set_bit(offset, se->cur_valid_map)) {
|
|
|
|
#ifdef CONFIG_F2FS_CHECK_FS
|
|
|
|
if (f2fs_test_and_set_bit(offset,
|
|
|
|
se->cur_valid_map_mir))
|
|
|
|
f2fs_bug_on(sbi, 1);
|
|
|
|
else
|
|
|
|
WARN_ON(1);
|
|
|
|
#else
|
2014-09-03 03:05:00 +04:00
|
|
|
f2fs_bug_on(sbi, 1);
|
2017-01-07 13:51:01 +03:00
|
|
|
#endif
|
|
|
|
}
|
2016-08-02 20:56:40 +03:00
|
|
|
if (f2fs_discard_en(sbi) &&
|
|
|
|
!f2fs_test_and_set_bit(offset, se->discard_map))
|
2015-05-01 08:37:50 +03:00
|
|
|
sbi->discard_blks--;
|
2012-11-02 12:09:16 +04:00
|
|
|
} else {
|
2017-01-07 13:51:01 +03:00
|
|
|
if (!f2fs_test_and_clear_bit(offset, se->cur_valid_map)) {
|
|
|
|
#ifdef CONFIG_F2FS_CHECK_FS
|
|
|
|
if (!f2fs_test_and_clear_bit(offset,
|
|
|
|
se->cur_valid_map_mir))
|
|
|
|
f2fs_bug_on(sbi, 1);
|
|
|
|
else
|
|
|
|
WARN_ON(1);
|
|
|
|
#else
|
2014-09-03 03:05:00 +04:00
|
|
|
f2fs_bug_on(sbi, 1);
|
2017-01-07 13:51:01 +03:00
|
|
|
#endif
|
|
|
|
}
|
2016-08-02 20:56:40 +03:00
|
|
|
if (f2fs_discard_en(sbi) &&
|
|
|
|
f2fs_test_and_clear_bit(offset, se->discard_map))
|
2015-05-01 08:37:50 +03:00
|
|
|
sbi->discard_blks++;
|
2012-11-02 12:09:16 +04:00
|
|
|
}
|
|
|
|
if (!f2fs_test_bit(offset, se->ckpt_valid_map))
|
|
|
|
se->ckpt_valid_blocks += del;
|
|
|
|
|
|
|
|
__mark_sit_entry_dirty(sbi, segno);
|
|
|
|
|
|
|
|
/* update total number of valid blocks to be written in ckpt area */
|
|
|
|
SIT_I(sbi)->written_valid_blocks += del;
|
|
|
|
|
|
|
|
if (sbi->segs_per_sec > 1)
|
|
|
|
get_sec_entry(sbi, segno)->valid_blocks += del;
|
|
|
|
}
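The sanity check at the top of update_sit_entry() can be tried out in user space. The sketch below is only an illustration, not kernel code: `BLOCKS_PER_SEG`, `vblocks_out_of_range()` and `apply_delta()` are hypothetical names, and the segment size of 512 blocks is an assumption standing in for `sbi->blocks_per_seg`. It shows how the shift by `sizeof(unsigned short) << 3` (16 bits) rejects any count that would not fit in an unsigned short, which also catches a negative result after a `-1` delta.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed segment size in blocks; stands in for sbi->blocks_per_seg. */
#define BLOCKS_PER_SEG 512L

/*
 * Same condition as the one passed to f2fs_bug_on(): the count is rejected
 * if any bit above the low 16 is set or if it exceeds the segment size.
 */
static bool vblocks_out_of_range(long new_vblocks)
{
	return (new_vblocks >> (sizeof(unsigned short) << 3)) != 0 ||
	       new_vblocks > BLOCKS_PER_SEG;
}

/* Hypothetical helper: apply a +1/-1 delta only if the result is sane. */
static bool apply_delta(unsigned short *valid_blocks, int del)
{
	long new_vblocks = (long)*valid_blocks + del;

	if (vblocks_out_of_range(new_vblocks))
		return false;
	*valid_blocks = (unsigned short)new_vblocks;
	return true;
}
```

Unlike this sketch, the kernel function does not return an error: it treats an out-of-range count as a bug and reports it through f2fs_bug_on().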
|
|
|
|
|
2014-01-28 07:22:14 +04:00
|
|
|
void refresh_sit_entry(struct f2fs_sb_info *sbi, block_t old, block_t new)
|
2012-11-02 12:09:16 +04:00
|
|
|
{
|
2014-01-28 07:22:14 +04:00
|
|
|
update_sit_entry(sbi, new, 1);
|
|
|
|
if (GET_SEGNO(sbi, old) != NULL_SEGNO)
|
|
|
|
update_sit_entry(sbi, old, -1);
|
|
|
|
|
|
|
|
locate_dirty_segment(sbi, GET_SEGNO(sbi, old));
|
|
|
|
locate_dirty_segment(sbi, GET_SEGNO(sbi, new));
|
2012-11-02 12:09:16 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
void invalidate_blocks(struct f2fs_sb_info *sbi, block_t addr)
|
|
|
|
{
|
|
|
|
unsigned int segno = GET_SEGNO(sbi, addr);
|
|
|
|
struct sit_info *sit_i = SIT_I(sbi);
|
|
|
|
|
2014-09-03 02:52:58 +04:00
|
|
|
f2fs_bug_on(sbi, addr == NULL_ADDR);
|
2012-11-02 12:09:16 +04:00
|
|
|
if (addr == NEW_ADDR)
|
|
|
|
return;
|
|
|
|
|
|
|
|
/* add it into sit main buffer */
|
|
|
|
mutex_lock(&sit_i->sentry_lock);
|
|
|
|
|
|
|
|
update_sit_entry(sbi, addr, -1);
|
|
|
|
|
|
|
|
/* add it into dirty seglist */
|
|
|
|
locate_dirty_segment(sbi, segno);
|
|
|
|
|
|
|
|
mutex_unlock(&sit_i->sentry_lock);
|
|
|
|
}
|
|
|
|
|
2015-10-07 22:28:41 +03:00
|
|
|
bool is_checkpointed_data(struct f2fs_sb_info *sbi, block_t blkaddr)
|
|
|
|
{
|
|
|
|
struct sit_info *sit_i = SIT_I(sbi);
|
|
|
|
unsigned int segno, offset;
|
|
|
|
struct seg_entry *se;
|
|
|
|
bool is_cp = false;
|
|
|
|
|
|
|
|
if (blkaddr == NEW_ADDR || blkaddr == NULL_ADDR)
|
|
|
|
return true;
|
|
|
|
|
|
|
|
mutex_lock(&sit_i->sentry_lock);
|
|
|
|
|
|
|
|
segno = GET_SEGNO(sbi, blkaddr);
|
|
|
|
se = get_seg_entry(sbi, segno);
|
|
|
|
offset = GET_BLKOFF_FROM_SEG0(sbi, blkaddr);
|
|
|
|
|
|
|
|
if (f2fs_test_bit(offset, se->ckpt_valid_map))
|
|
|
|
is_cp = true;
|
|
|
|
|
|
|
|
mutex_unlock(&sit_i->sentry_lock);
|
|
|
|
|
|
|
|
return is_cp;
|
|
|
|
}
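is_checkpointed_data() comes down to one bit test in the per-segment checkpoint bitmap. The following user-space sketch shows what such a test, and the test-and-set variant used by update_sit_entry(), might look like. It assumes a byte-wise bitmap with the most significant bit first in each byte; the `sketch_` names are hypothetical and are not the kernel helpers themselves.

```c
#include <assert.h>
#include <stdbool.h>

/* Return true if bit nr is set (assumed layout: MSB first in each byte). */
static bool sketch_test_bit(unsigned int nr, const unsigned char *map)
{
	return (map[nr >> 3] & (1u << (7 - (nr & 7)))) != 0;
}

/* Set bit nr and report whether it was already set beforehand. */
static bool sketch_test_and_set_bit(unsigned int nr, unsigned char *map)
{
	unsigned char mask = (unsigned char)(1u << (7 - (nr & 7)));
	bool was_set = (map[nr >> 3] & mask) != 0;

	map[nr >> 3] |= mask;
	return was_set;
}
```

A "was already set" result on the allocation path is what update_sit_entry() treats as a bug: a block is being marked valid twice.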
|
|
|
|
|
2012-11-29 08:28:09 +04:00
|
|
|
/*
|
2012-11-02 12:09:16 +04:00
|
|
|
* This function must be called with the curseg_mutex lock held
|
|
|
|
*/
|
|
|
|
static void __add_sum_entry(struct f2fs_sb_info *sbi, int type,
|
2013-06-13 12:59:27 +04:00
|
|
|
struct f2fs_summary *sum)
|
2012-11-02 12:09:16 +04:00
|
|
|
{
|
|
|
|
struct curseg_info *curseg = CURSEG_I(sbi, type);
|
|
|
|
void *addr = curseg->sum_blk;
|
2013-06-13 12:59:27 +04:00
|
|
|
addr += curseg->next_blkoff * sizeof(struct f2fs_summary);
|
2012-11-02 12:09:16 +04:00
|
|
|
memcpy(addr, sum, sizeof(struct f2fs_summary));
|
|
|
|
}
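__add_sum_entry() copies one fixed-size summary record into the slot selected by `next_blkoff`. A minimal user-space sketch of the same slot arithmetic follows. The struct layouts and the `sketch_` names are hypothetical simplifications, not the on-disk f2fs format, and the byte pointer replaces the `void *` arithmetic of the original, which relies on a GNU C extension.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified summary record; not the on-disk f2fs layout. */
struct sketch_summary {
	uint32_t nid;
	uint16_t ofs_in_node;
};

#define SKETCH_ENTRIES 512

/* Hypothetical stand-in for the current-segment state. */
struct sketch_curseg {
	struct sketch_summary sum_blk[SKETCH_ENTRIES];
	unsigned short next_blkoff;
};

/* Copy *sum into the slot at index next_blkoff of the summary block. */
static void sketch_add_sum_entry(struct sketch_curseg *curseg,
				 const struct sketch_summary *sum)
{
	unsigned char *addr = (unsigned char *)curseg->sum_blk;

	addr += (size_t)curseg->next_blkoff * sizeof(*sum);
	memcpy(addr, sum, sizeof(*sum));
}
```

As in the original, the sketch performs no locking of its own; the caller is expected to serialize access to the current segment.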
|
|
|
|
|
2012-11-29 08:28:09 +04:00
|
|
|
/*
|
2012-11-02 12:09:16 +04:00
|
|
|
* Calculate the number of current summary pages for writing
|
|
|
|
*/
|
2014-12-09 09:21:46 +03:00
|
|
|
int npages_for_summary_flush(struct f2fs_sb_info *sbi, bool for_ra)
|
2012-11-02 12:09:16 +04:00
|
|
|
{
|
|
|
|
int valid_sum_count = 0;
|
2013-10-29 12:21:47 +04:00
|
|
|
int i, sum_in_page;
|
2012-11-02 12:09:16 +04:00
|
|
|
|
|
|
|
for (i = CURSEG_HOT_DATA; i <= CURSEG_COLD_DATA; i++) {
|
|
|
|
if (sbi->ckpt->alloc_type[i] == SSR)
|
|
|
|
valid_sum_count += sbi->blocks_per_seg;
|
2014-12-09 09:21:46 +03:00
|
|
|
else {
|
|
|
|
if (for_ra)
|
|
|
|
valid_sum_count += le16_to_cpu(
|
|
|
|
F2FS_CKPT(sbi)->cur_data_blkoff[i]);
|
|
|
|
else
|
|
|
|
valid_sum_count += curseg_blkoff(sbi, i);
|
|
|
|
}
|
2012-11-02 12:09:16 +04:00
|
|
|
}
|
|
|
|
|
mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.
This promise never materialized. And unlikely will.
We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE. And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.
Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.
Let's stop pretending that pages in page cache are special. They are
not.
The changes are pretty straight-forward:
- <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
- page_cache_get() -> get_page();
- page_cache_release() -> put_page();
This patch contains automated changes generated with coccinelle using
script below. For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.
The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.
There are few places in the code where coccinelle didn't reach. I'll
fix them manually in a separate patch. Comments and documentation also
will be addressed with the separate patch.
virtual patch
@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT
@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE
@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK
@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)
@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)
@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-01 15:29:47 +03:00
|
|
|
sum_in_page = (PAGE_SIZE - 2 * SUM_JOURNAL_SIZE -
|
2013-10-29 12:21:47 +04:00
|
|
|
SUM_FOOTER_SIZE) / SUMMARY_SIZE;
|
|
|
|
if (valid_sum_count <= sum_in_page)
|
2012-11-02 12:09:16 +04:00
|
|
|
return 1;
|
2013-10-29 12:21:47 +04:00
|
|
|
else if ((valid_sum_count - sum_in_page) <=
|
2016-04-01 15:29:47 +03:00
|
|
|
(PAGE_SIZE - SUM_FOOTER_SIZE) / SUMMARY_SIZE)
|
2012-11-02 12:09:16 +04:00
|
|
|
return 2;
|
|
|
|
return 3;
|
|
|
|
}
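The page-count decision above can be sketched in user space as follows. This is a minimal illustration, not the kernel function: the `SKETCH_*` constants are assumed values for a 4 KiB page, the first branch (`valid_sum_count <= sum_in_page`) is assumed because it is not visible in this excerpt, and the name `sketch_npages_for_summaries` is hypothetical.

```c
#include <assert.h>

/* Assumed sizes for illustration only; the real values come from f2fs headers. */
enum {
	SKETCH_PAGE_SIZE        = 4096,
	SKETCH_SUMMARY_SIZE     = 7,	/* bytes per summary entry (assumed) */
	SKETCH_SUM_FOOTER_SIZE  = 5,	/* bytes per summary footer (assumed) */
	SKETCH_SUM_JOURNAL_SIZE = 507	/* bytes per journal area (assumed) */
};

/* Return how many pages are needed to hold valid_sum_count compacted summaries. */
int sketch_npages_for_summaries(int valid_sum_count)
{
	/* The first page also carries two journal areas and a footer. */
	int sum_in_page = (SKETCH_PAGE_SIZE - 2 * SKETCH_SUM_JOURNAL_SIZE -
			   SKETCH_SUM_FOOTER_SIZE) / SKETCH_SUMMARY_SIZE;
	/* A follow-up page only carries a footer. */
	int sum_in_next = (SKETCH_PAGE_SIZE - SKETCH_SUM_FOOTER_SIZE) /
			  SKETCH_SUMMARY_SIZE;

	assert(valid_sum_count >= 0);

	if (valid_sum_count <= sum_in_page)
		return 1;
	else if ((valid_sum_count - sum_in_page) <= sum_in_next)
		return 2;
	return 3;
}
```

Under these assumed sizes the first page holds 439 entries and a second page holds 584 more, so the thresholds fall at 439 and 1023 entries.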
|
|
|
|
|
2012-11-29 08:28:09 +04:00
|
|
|
/*
|
2012-11-02 12:09:16 +04:00
|
|
|
* Caller should put this summary page
|
|
|
|
*/
|
|
|
|
struct page *get_sum_page(struct f2fs_sb_info *sbi, unsigned int segno)
|
|
|
|
{
|
|
|
|
return get_meta_page(sbi, GET_SUM_BLOCK(sbi, segno));
|
|
|
|
}
|
|
|
|
|
2015-05-19 12:40:04 +03:00
|
|
|
void update_meta_page(struct f2fs_sb_info *sbi, void *src, block_t blk_addr)
|
2012-11-02 12:09:16 +04:00
|
|
|
{
|
|
|
|
struct page *page = grab_meta_page(sbi, blk_addr);
|
2015-05-19 12:40:04 +03:00
|
|
|
void *dst = page_address(page);
|
|
|
|
|
|
|
|
if (src)
|
2016-04-01 15:29:47 +03:00
|
|
|
memcpy(dst, src, PAGE_SIZE);
|
2015-05-19 12:40:04 +03:00
|
|
|
else
|
2016-04-01 15:29:47 +03:00
|
|
|
memset(dst, 0, PAGE_SIZE);
|
2012-11-02 12:09:16 +04:00
|
|
|
set_page_dirty(page);
|
|
|
|
f2fs_put_page(page, 1);
|
|
|
|
}
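The copy-or-zero behaviour of update_meta_page() can be sketched outside the kernel like this. It is a simplified illustration under stated assumptions: a plain buffer stands in for the locked meta page, and `sketch_fill_block` is a hypothetical name; page locking, dirtying, and release are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Fill a block-sized buffer: copy from src when one is given,
 * otherwise zero the whole buffer.
 */
void sketch_fill_block(void *dst, const void *src, size_t size)
{
	assert(dst != NULL);

	if (src)
		memcpy(dst, src, size);
	else
		memset(dst, 0, size);
}
```

Passing a NULL source is how a caller would ask for an all-zero block in this sketch.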
|
|
|
|
|
2015-05-19 12:40:04 +03:00
|
|
|
static void write_sum_page(struct f2fs_sb_info *sbi,
|
|
|
|
struct f2fs_summary_block *sum_blk, block_t blk_addr)
|
|
|
|
{
|
|
|
|
update_meta_page(sbi, (void *)sum_blk, blk_addr);
|
|
|
|
}
|
|
|
|
|
f2fs: split journal cache from curseg cache
In curseg cache, f2fs caches two different parts:
- datas of current summay block, i.e. summary entries, footer info.
- journal info, i.e. sparse nat/sit entries or io stat info.
With this approach, 1) it may cause higher lock contention when we access
or update both of the parts of cache since we use the same mutex lock
curseg_mutex to protect the cache. 2) current summary block with last
journal info will be writebacked into device as a normal summary block
when flushing, however, we treat journal info as valid one only in current
summary, so most normal summary blocks contain junk journal data, it wastes
remaining space of summary block.
So, in order to fix above issues, we split curseg cache into two parts:
a) current summary block, protected by original mutex lock curseg_mutex
b) journal cache, protected by newly introduced r/w semaphore journal_rwsem
When loading curseg cache during ->mount, we store summary info and
journal info into different caches; When doing checkpoint, we combine
datas of two cache into current summary block for persisting.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-02-19 13:08:46 +03:00
|
|
|
static void write_current_sum_page(struct f2fs_sb_info *sbi,
|
|
|
|
int type, block_t blk_addr)
|
|
|
|
{
|
|
|
|
struct curseg_info *curseg = CURSEG_I(sbi, type);
|
|
|
|
struct page *page = grab_meta_page(sbi, blk_addr);
|
|
|
|
struct f2fs_summary_block *src = curseg->sum_blk;
|
|
|
|
struct f2fs_summary_block *dst;
|
|
|
|
|
|
|
|
dst = (struct f2fs_summary_block *)page_address(page);
|
|
|
|
|
|
|
|
mutex_lock(&curseg->curseg_mutex);
|
|
|
|
|
|
|
|
down_read(&curseg->journal_rwsem);
|
|
|
|
memcpy(&dst->journal, curseg->journal, SUM_JOURNAL_SIZE);
|
|
|
|
up_read(&curseg->journal_rwsem);
|
|
|
|
|
|
|
|
memcpy(dst->entries, src->entries, SUM_ENTRY_SIZE);
|
|
|
|
memcpy(&dst->footer, &src->footer, SUM_FOOTER_SIZE);
|
|
|
|
|
|
|
|
mutex_unlock(&curseg->curseg_mutex);
|
|
|
|
|
|
|
|
set_page_dirty(page);
|
|
|
|
f2fs_put_page(page, 1);
|
|
|
|
}
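How write_current_sum_page() assembles one on-disk block from the two caches can be sketched as below. This is a user-space illustration only: the byte sizes and the `sk_sum_block` layout are assumptions for a 4 KiB block, all `sk_*` names are hypothetical, and the mutex and r/w semaphore that protect the two caches in the kernel are omitted.

```c
#include <assert.h>
#include <string.h>

/* Assumed region sizes for a 4096-byte summary block (illustrative). */
enum {
	SK_ENTRY_BYTES   = 3584,
	SK_JOURNAL_BYTES = 507,
	SK_FOOTER_BYTES  = 5
};

/* Byte arrays only, so the struct has no padding. */
struct sk_sum_block {
	unsigned char entries[SK_ENTRY_BYTES];
	unsigned char journal[SK_JOURNAL_BYTES];
	unsigned char footer[SK_FOOTER_BYTES];
};

/*
 * Build the block to persist: the journal comes from the separate
 * journal cache, entries and footer come from the cached summary block.
 * Whatever sits in cached_sum->journal is ignored.
 */
void sk_compose_sum_block(struct sk_sum_block *dst,
			  const struct sk_sum_block *cached_sum,
			  const unsigned char *cached_journal)
{
	memcpy(dst->journal, cached_journal, SK_JOURNAL_BYTES);
	memcpy(dst->entries, cached_sum->entries, SK_ENTRY_BYTES);
	memcpy(dst->footer, cached_sum->footer, SK_FOOTER_BYTES);
}
```

Keeping the journal in its own buffer is what lets the two parts be guarded by different locks, as the commit message above describes.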
|
|
|
|
|
2013-03-31 08:58:51 +04:00
|
|
|
static int is_next_segment_free(struct f2fs_sb_info *sbi, int type)
|
|
|
|
{
|
|
|
|
struct curseg_info *curseg = CURSEG_I(sbi, type);
|
2013-05-14 14:20:28 +04:00
|
|
|
unsigned int segno = curseg->segno + 1;
|
2013-03-31 08:58:51 +04:00
|
|
|
struct free_segmap_info *free_i = FREE_I(sbi);
|
|
|
|
|
2014-09-23 22:23:01 +04:00
|
|
|
if (segno < MAIN_SEGS(sbi) && segno % sbi->segs_per_sec)
|
2013-05-14 14:20:28 +04:00
|
|
|
return !test_bit(segno, free_i->free_segmap);
|
2013-03-31 08:58:51 +04:00
|
|
|
return 0;
|
|
|
|
}
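The check in is_next_segment_free() can be sketched in user space as follows. The sketch assumes, as the code above suggests, that a set bit in the free-segment map means "in use"; `sk_test_bit` is a simplified stand-in for the kernel's test_bit(), and all `sk_*` names and parameters are hypothetical.

```c
#include <assert.h>
#include <limits.h>

#define SK_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Return 1 if bit nr is set in map, 0 otherwise. */
int sk_test_bit(unsigned int nr, const unsigned long *map)
{
	return (int)((map[nr / SK_BITS_PER_LONG] >> (nr % SK_BITS_PER_LONG)) & 1UL);
}

/*
 * Return 1 only if the segment right after cur_segno exists, lies in the
 * same section (it is not the first segment of a new section), and is not
 * marked in use.
 */
int sk_is_next_segment_free(const unsigned long *inuse_map,
			    unsigned int cur_segno,
			    unsigned int main_segs,
			    unsigned int segs_per_sec)
{
	unsigned int segno = cur_segno + 1;

	assert(segs_per_sec > 0);

	if (segno < main_segs && segno % segs_per_sec)
		return !sk_test_bit(segno, inuse_map);
	return 0;
}
```

With four segments per section, the segment after segment 3 starts a new section, so the sketch reports "not free" there regardless of the bitmap.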
|
|
|
|
|
2012-11-29 08:28:09 +04:00
|
|
|
/*
|
2012-11-02 12:09:16 +04:00
|
|
|
* Find a new segment from the free segments bitmap in the right order.
|
|
|
|
* This function must return successfully; otherwise it triggers BUG.
|
|
|
|
*/
|
|
|
|
static void get_new_segment(struct f2fs_sb_info *sbi,
|
|
|
|
unsigned int *newseg, bool new_sec, int dir)
|
|
|
|
{
|
|
|
|
struct free_segmap_info *free_i = FREE_I(sbi);
|
|
|
|
unsigned int segno, secno, zoneno;
|
2014-09-23 22:23:01 +04:00
|
|
|
unsigned int total_zones = MAIN_SECS(sbi) / sbi->secs_per_zone;
|
2012-11-02 12:09:16 +04:00
|
|
|
unsigned int hint = *newseg / sbi->segs_per_sec;
|
|
|
|
unsigned int old_zoneno = GET_ZONENO_FROM_SEGNO(sbi, *newseg);
|
|
|
|
unsigned int left_start = hint;
|
|
|
|
bool init = true;
|
|
|
|
int go_left = 0;
|
|
|
|
int i;
|
|
|
|
|
2015-02-11 13:20:38 +03:00
|
|
|
spin_lock(&free_i->segmap_lock);
|
2012-11-02 12:09:16 +04:00
|
|
|
|
|
|
|
if (!new_sec && ((*newseg + 1) % sbi->segs_per_sec)) {
|
|
|
|
segno = find_next_zero_bit(free_i->free_segmap,
|
2016-01-22 12:42:06 +03:00
|
|
|
(hint + 1) * sbi->segs_per_sec, *newseg + 1);
|
|
|
|
if (segno < (hint + 1) * sbi->segs_per_sec)
|
2012-11-02 12:09:16 +04:00
|
|
|
goto got_it;
|
|
|
|
}
|
|
|
|
find_other_zone:
|
2014-09-23 22:23:01 +04:00
|
|
|
secno = find_next_zero_bit(free_i->free_secmap, MAIN_SECS(sbi), hint);
|
|
|
|
if (secno >= MAIN_SECS(sbi)) {
|
2012-11-02 12:09:16 +04:00
|
|
|
if (dir == ALLOC_RIGHT) {
|
|
|
|
secno = find_next_zero_bit(free_i->free_secmap,
|
2014-09-23 22:23:01 +04:00
|
|
|
MAIN_SECS(sbi), 0);
|
|
|
|
f2fs_bug_on(sbi, secno >= MAIN_SECS(sbi));
|
2012-11-02 12:09:16 +04:00
|
|
|
} else {
|
|
|
|
go_left = 1;
|
|
|
|
left_start = hint - 1;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
if (go_left == 0)
|
|
|
|
goto skip_left;
|
|
|
|
|
|
|
|
while (test_bit(left_start, free_i->free_secmap)) {
|
|
|
|
if (left_start > 0) {
|
|
|
|
left_start--;
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
left_start = find_next_zero_bit(free_i->free_secmap,
|
2014-09-23 22:23:01 +04:00
|
|
|
MAIN_SECS(sbi), 0);
|
|
|
|
f2fs_bug_on(sbi, left_start >= MAIN_SECS(sbi));
|
2012-11-02 12:09:16 +04:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
secno = left_start;
|
|
|
|
skip_left:
|
|
|
|
hint = secno;
|
|
|
|
segno = secno * sbi->segs_per_sec;
|
|
|
|
zoneno = secno / sbi->secs_per_zone;
|
|
|
|
|
|
|
|
/* give up on finding another zone */
|
|
|
|
if (!init)
|
|
|
|
goto got_it;
|
|
|
|
if (sbi->secs_per_zone == 1)
|
|
|
|
goto got_it;
|
|
|
|
if (zoneno == old_zoneno)
|
|
|
|
goto got_it;
|
|
|
|
if (dir == ALLOC_LEFT) {
|
|
|
|
if (!go_left && zoneno + 1 >= total_zones)
|
|
|
|
goto got_it;
|
|
|
|
if (go_left && zoneno == 0)
|
|
|
|
goto got_it;
|
|
|
|
}
|
|
|
|
for (i = 0; i < NR_CURSEG_TYPE; i++)
|
|
|
|
if (CURSEG_I(sbi, i)->zone == zoneno)
|
|
|
|
break;
|
|
|
|
|
|
|
|
if (i < NR_CURSEG_TYPE) {
|
|
|
|
/* zone is in use, try another */
|
|
|
|
if (go_left)
|
|
|
|
hint = zoneno * sbi->secs_per_zone - 1;
|
|
|
|
else if (zoneno + 1 >= total_zones)
|
|
|
|
hint = 0;
|
|
|
|
else
|
|
|
|
hint = (zoneno + 1) * sbi->secs_per_zone;
|
|
|
|
init = false;
|
|
|
|
goto find_other_zone;
|
|
|
|
}
|
|
|
|
got_it:
|
|
|
|
/* set it as dirty segment in free segmap */
|
2014-09-03 02:52:58 +04:00
|
|
|
f2fs_bug_on(sbi, test_bit(segno, free_i->free_segmap));
|
2012-11-02 12:09:16 +04:00
|
|
|
__set_inuse(sbi, segno);
|
|
|
|
*newseg = segno;
|
2015-02-11 13:20:38 +03:00
|
|
|
spin_unlock(&free_i->segmap_lock);
|
2012-11-02 12:09:16 +04:00
|
|
|
}
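The section search at the heart of get_new_segment() can be sketched in user space as below. This is a deliberately reduced illustration, not the kernel logic: it uses one byte per section instead of a bitmap, it omits the segmap spinlock, the in-section fast path, and the retry that avoids zones already used by another active log, and all `sk_*` names are hypothetical.

```c
#include <assert.h>

enum sk_dir { SK_ALLOC_RIGHT, SK_ALLOC_LEFT };

/* First index in [start, size) whose entry is 0, or size if there is none. */
unsigned int sk_find_next_zero(const unsigned char *inuse,
			       unsigned int size, unsigned int start)
{
	unsigned int i;

	for (i = start; i < size; i++)
		if (!inuse[i])
			return i;
	return size;
}

/*
 * Pick a free section: search to the right of hint first. If nothing is
 * free there, SK_ALLOC_RIGHT wraps around to section 0, while
 * SK_ALLOC_LEFT walks downward from hint - 1.
 */
unsigned int sk_pick_section(const unsigned char *inuse, unsigned int nsecs,
			     unsigned int hint, enum sk_dir dir)
{
	unsigned int secno = sk_find_next_zero(inuse, nsecs, hint);
	unsigned int left;

	if (secno < nsecs)
		return secno;

	if (dir == SK_ALLOC_RIGHT) {
		secno = sk_find_next_zero(inuse, nsecs, 0);
		assert(secno < nsecs);	/* the kernel treats this as a bug */
		return secno;
	}

	for (left = hint; left > 0; left--)
		if (!inuse[left - 1])
			return left - 1;

	assert(!"no free section");
	return nsecs;
}
```

In both directions the search to the right of the hint wins whenever it succeeds; the direction only matters once that search runs off the end of the map.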

static void reset_curseg(struct f2fs_sb_info *sbi, int type, int modified)
{
	struct curseg_info *curseg = CURSEG_I(sbi, type);
	struct summary_footer *sum_footer;

	curseg->segno = curseg->next_segno;
	curseg->zone = GET_ZONENO_FROM_SEGNO(sbi, curseg->segno);
	curseg->next_blkoff = 0;
	curseg->next_segno = NULL_SEGNO;

	sum_footer = &(curseg->sum_blk->footer);
	memset(sum_footer, 0, sizeof(struct summary_footer));
	if (IS_DATASEG(type))
		SET_SUM_TYPE(sum_footer, SUM_TYPE_DATA);
	if (IS_NODESEG(type))
		SET_SUM_TYPE(sum_footer, SUM_TYPE_NODE);
	__set_sit_entry_type(sbi, type, curseg->segno, modified);
}

/*
 * Allocate a current working segment.
 * This function always allocates a free segment in an LFS manner.
 */
static void new_curseg(struct f2fs_sb_info *sbi, int type, bool new_sec)
{
	struct curseg_info *curseg = CURSEG_I(sbi, type);
	unsigned int segno = curseg->segno;
	int dir = ALLOC_LEFT;

	write_sum_page(sbi, curseg->sum_blk,
				GET_SUM_BLOCK(sbi, segno));
	if (type == CURSEG_WARM_DATA || type == CURSEG_COLD_DATA)
		dir = ALLOC_RIGHT;

	if (test_opt(sbi, NOHEAP))
		dir = ALLOC_RIGHT;

	get_new_segment(sbi, &segno, new_sec, dir);
	curseg->next_segno = segno;
	reset_curseg(sbi, type, 1);
	curseg->alloc_type = LFS;
}

static void __next_free_blkoff(struct f2fs_sb_info *sbi,
			struct curseg_info *seg, block_t start)
{
	struct seg_entry *se = get_seg_entry(sbi, seg->segno);
	int entries = SIT_VBLOCK_MAP_SIZE / sizeof(unsigned long);
	unsigned long *target_map = SIT_I(sbi)->tmp_map;
	unsigned long *ckpt_map = (unsigned long *)se->ckpt_valid_map;
	unsigned long *cur_map = (unsigned long *)se->cur_valid_map;
	int i, pos;

	for (i = 0; i < entries; i++)
		target_map[i] = ckpt_map[i] | cur_map[i];

	pos = __find_rev_next_zero_bit(target_map, sbi->blocks_per_seg, start);

	seg->next_blkoff = pos;
}

/*
 * If a segment is written in an LFS manner, the next block offset is obtained
 * simply by incrementing the current block offset. However, if a segment is
 * written in an SSR manner, the next block offset is obtained by calling
 * __next_free_blkoff.
 */
static void __refresh_next_blkoff(struct f2fs_sb_info *sbi,
				struct curseg_info *seg)
{
	if (seg->alloc_type == SSR)
		__next_free_blkoff(sbi, seg, seg->next_blkoff + 1);
	else
		seg->next_blkoff++;
}

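/*
 * Illustration only, not part of segment.c: a minimal user-space sketch of
 * the two ways the next block offset advances, assuming a toy segment of 16
 * blocks. The names seg_model, set_block_valid, next_free_blkoff and
 * refresh_next_blkoff are hypothetical. The sketch tests bits in plain
 * LSB-first order, whereas the kernel relies on __find_rev_next_zero_bit,
 * whose bit ordering is not reproduced here.
 */

```c
#include <assert.h>
#include <limits.h>

#define BLOCKS_PER_SEG 16u	/* toy value, chosen for the example */
#define BITS_PER_WORD ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))
#define MAP_WORDS ((BLOCKS_PER_SEG + BITS_PER_WORD - 1) / BITS_PER_WORD)

enum alloc_mode { MODE_LFS, MODE_SSR };

/* Hypothetical stand-in for the per-segment state used above. */
struct seg_model {
	unsigned long ckpt_map[MAP_WORDS];	/* blocks valid at last checkpoint */
	unsigned long cur_map[MAP_WORDS];	/* blocks valid right now */
	unsigned int next_blkoff;
	enum alloc_mode mode;
};

/* Mark block 'off' as valid in the given bitmap. */
void set_block_valid(unsigned long *map, unsigned int off)
{
	assert(off < BLOCKS_PER_SEG);
	map[off / BITS_PER_WORD] |= 1UL << (off % BITS_PER_WORD);
}

/* Return nonzero when block 'off' is marked valid in the given bitmap. */
int block_is_valid(const unsigned long *map, unsigned int off)
{
	return (int)((map[off / BITS_PER_WORD] >> (off % BITS_PER_WORD)) & 1UL);
}

/*
 * First offset >= start that is free in both bitmaps, or BLOCKS_PER_SEG
 * when the segment has no free block left. Testing both maps has the same
 * effect as OR-ing them into a temporary map and searching for a zero bit.
 */
unsigned int next_free_blkoff(const struct seg_model *seg, unsigned int start)
{
	unsigned int off;

	for (off = start; off < BLOCKS_PER_SEG; off++)
		if (!block_is_valid(seg->ckpt_map, off) &&
		    !block_is_valid(seg->cur_map, off))
			return off;
	return BLOCKS_PER_SEG;
}

/* SSR skips over blocks that are still valid; LFS just moves forward. */
void refresh_next_blkoff(struct seg_model *seg)
{
	if (seg->mode == MODE_SSR)
		seg->next_blkoff = next_free_blkoff(seg, seg->next_blkoff + 1);
	else
		seg->next_blkoff++;
}
```

/*
 * A return value of BLOCKS_PER_SEG plays the role of "segment full" in this
 * sketch, which mirrors the next_blkoff < blocks_per_seg check used later in
 * this file.
 */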
/*
 * This function always allocates a used segment (from the dirty seglist) in
 * an SSR manner, so it should recover the existing segment information of
 * valid blocks.
 */
static void change_curseg(struct f2fs_sb_info *sbi, int type, bool reuse)
{
	struct dirty_seglist_info *dirty_i = DIRTY_I(sbi);
	struct curseg_info *curseg = CURSEG_I(sbi, type);
	unsigned int new_segno = curseg->next_segno;
	struct f2fs_summary_block *sum_node;
	struct page *sum_page;

	write_sum_page(sbi, curseg->sum_blk,
				GET_SUM_BLOCK(sbi, curseg->segno));
	__set_test_and_inuse(sbi, new_segno);

	mutex_lock(&dirty_i->seglist_lock);
	__remove_dirty_segment(sbi, new_segno, PRE);
	__remove_dirty_segment(sbi, new_segno, DIRTY);
	mutex_unlock(&dirty_i->seglist_lock);

	reset_curseg(sbi, type, 1);
	curseg->alloc_type = SSR;
	__next_free_blkoff(sbi, curseg, 0);

	if (reuse) {
		sum_page = get_sum_page(sbi, new_segno);
		sum_node = (struct f2fs_summary_block *)page_address(sum_page);
		memcpy(curseg->sum_blk, sum_node, SUM_ENTRY_SIZE);
		f2fs_put_page(sum_page, 1);
	}
}

static int get_ssr_segment(struct f2fs_sb_info *sbi, int type)
{
	struct curseg_info *curseg = CURSEG_I(sbi, type);
	const struct victim_selection *v_ops = DIRTY_I(sbi)->v_ops;

	if (IS_NODESEG(type) || !has_not_enough_free_secs(sbi, 0, 0))
		return v_ops->get_victim(sbi,
				&(curseg)->next_segno, BG_GC, type, SSR);

	/* For data segments, let's do SSR more intensively */
	for (; type >= CURSEG_HOT_DATA; type--)
		if (v_ops->get_victim(sbi, &(curseg)->next_segno,
						BG_GC, type, SSR))
			return 1;
	return 0;
}

/*
 * Flush out the current segment and replace it with a new segment.
 * This function must return successfully; anything else is a BUG.
 */
static void allocate_segment_by_default(struct f2fs_sb_info *sbi,
						int type, bool force)
{
	struct curseg_info *curseg = CURSEG_I(sbi, type);

	if (force)
		new_curseg(sbi, type, true);
	else if (type == CURSEG_WARM_NODE)
		new_curseg(sbi, type, false);
	else if (curseg->alloc_type == LFS && is_next_segment_free(sbi, type))
		new_curseg(sbi, type, false);
	else if (need_SSR(sbi) && get_ssr_segment(sbi, type))
		change_curseg(sbi, type, true);
	else
		new_curseg(sbi, type, false);

	stat_inc_seg_type(sbi, curseg);
}

void allocate_new_segments(struct f2fs_sb_info *sbi)
{
	struct curseg_info *curseg;
	unsigned int old_segno;
	int i;

	for (i = CURSEG_HOT_DATA; i <= CURSEG_COLD_DATA; i++) {
		curseg = CURSEG_I(sbi, i);
		old_segno = curseg->segno;
		SIT_I(sbi)->s_ops->allocate_segment(sbi, i, true);
		locate_dirty_segment(sbi, old_segno);
	}
}

static const struct segment_allocation default_salloc_ops = {
	.allocate_segment = allocate_segment_by_default,
};

bool exist_trim_candidates(struct f2fs_sb_info *sbi, struct cp_control *cpc)
{
	__u64 trim_start = cpc->trim_start;
	bool has_candidate = false;

	mutex_lock(&SIT_I(sbi)->sentry_lock);
	for (; cpc->trim_start <= cpc->trim_end; cpc->trim_start++) {
		if (add_discard_addrs(sbi, cpc, true)) {
			has_candidate = true;
			break;
		}
	}
	mutex_unlock(&SIT_I(sbi)->sentry_lock);

	cpc->trim_start = trim_start;
	return has_candidate;
}

int f2fs_trim_fs(struct f2fs_sb_info *sbi, struct fstrim_range *range)
{
	__u64 start = F2FS_BYTES_TO_BLK(range->start);
	__u64 end = start + F2FS_BYTES_TO_BLK(range->len) - 1;
	unsigned int start_segno, end_segno;
	struct cp_control cpc;
	int err = 0;

	if (start >= MAX_BLKADDR(sbi) || range->len < sbi->blocksize)
		return -EINVAL;

	cpc.trimmed = 0;
	if (end <= MAIN_BLKADDR(sbi))
		goto out;

	if (is_sbi_flag_set(sbi, SBI_NEED_FSCK)) {
		f2fs_msg(sbi->sb, KERN_WARNING,
			"Found FS corruption, run fsck to fix.");
		goto out;
	}

	/* start/end segment number in main_area */
	start_segno = (start <= MAIN_BLKADDR(sbi)) ? 0 : GET_SEGNO(sbi, start);
	end_segno = (end >= MAX_BLKADDR(sbi)) ? MAIN_SEGS(sbi) - 1 :
						GET_SEGNO(sbi, end);
	cpc.reason = CP_DISCARD;
	cpc.trim_minlen = max_t(__u64, 1, F2FS_BYTES_TO_BLK(range->minlen));

	/* do checkpoint to issue discard commands safely */
	for (; start_segno <= end_segno; start_segno = cpc.trim_end + 1) {
		cpc.trim_start = start_segno;

		if (sbi->discard_blks == 0)
			break;
		else if (sbi->discard_blks < BATCHED_TRIM_BLOCKS(sbi))
			cpc.trim_end = end_segno;
		else
			cpc.trim_end = min_t(unsigned int,
				rounddown(start_segno +
				BATCHED_TRIM_SEGMENTS(sbi),
				sbi->segs_per_sec) - 1, end_segno);

		mutex_lock(&sbi->gc_mutex);
		err = write_checkpoint(sbi, &cpc);
		mutex_unlock(&sbi->gc_mutex);
		if (err)
			break;

		schedule();
	}
out:
	range->len = F2FS_BLK_TO_BYTES(cpc.trimmed);
	return err;
}

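/*
 * Illustration only, not part of segment.c: a minimal user-space sketch of
 * how the trim loop above cuts a segment range into section-aligned batches.
 * The helper names (round_down_to, batch_trim_end, count_trim_batches) and
 * the batch_segs parameter are hypothetical; batch_segs stands in for
 * whatever BATCHED_TRIM_SEGMENTS(sbi) evaluates to and is assumed here to be
 * at least segs_per_sec, which guarantees that every batch makes progress.
 */

```c
#include <assert.h>

/* Round x down to the nearest multiple of 'multiple'. */
unsigned int round_down_to(unsigned int x, unsigned int multiple)
{
	assert(multiple != 0);
	return x - (x % multiple);
}

/*
 * Last segment of the batch that starts at start_segno: the batch ends just
 * before the next section-aligned boundary at or below start + batch_segs,
 * and never runs past end_segno.
 */
unsigned int batch_trim_end(unsigned int start_segno, unsigned int end_segno,
			    unsigned int batch_segs, unsigned int segs_per_sec)
{
	unsigned int aligned;

	assert(batch_segs >= segs_per_sec);
	aligned = round_down_to(start_segno + batch_segs, segs_per_sec);
	return (aligned - 1 < end_segno) ? aligned - 1 : end_segno;
}

/* Walk [start_segno, end_segno] batch by batch and count the batches. */
unsigned int count_trim_batches(unsigned int start_segno,
				unsigned int end_segno,
				unsigned int batch_segs,
				unsigned int segs_per_sec)
{
	unsigned int batches = 0;

	while (start_segno <= end_segno) {
		unsigned int trim_end = batch_trim_end(start_segno, end_segno,
						       batch_segs, segs_per_sec);

		batches++;	/* the kernel would write a checkpoint here */
		start_segno = trim_end + 1;
	}
	return batches;
}
```

/*
 * With 4 segments per section and batches of 32 segments, a range starting
 * at segment 5 ends its first batch at segment 35, so that every later batch
 * starts on a section boundary.
 */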
static bool __has_curseg_space(struct f2fs_sb_info *sbi, int type)
{
	struct curseg_info *curseg = CURSEG_I(sbi, type);
	if (curseg->next_blkoff < sbi->blocks_per_seg)
		return true;
	return false;
}

static int __get_segment_type_2(struct page *page, enum page_type p_type)
{
	if (p_type == DATA)
		return CURSEG_HOT_DATA;
	else
		return CURSEG_HOT_NODE;
}

static int __get_segment_type_4(struct page *page, enum page_type p_type)
{
	if (p_type == DATA) {
		struct inode *inode = page->mapping->host;

		if (S_ISDIR(inode->i_mode))
			return CURSEG_HOT_DATA;
		else
			return CURSEG_COLD_DATA;
	} else {
		if (IS_DNODE(page) && is_cold_node(page))
			return CURSEG_WARM_NODE;
		else
			return CURSEG_COLD_NODE;
	}
}

static int __get_segment_type_6(struct page *page, enum page_type p_type)
{
	if (p_type == DATA) {
		struct inode *inode = page->mapping->host;

		if (S_ISDIR(inode->i_mode))
			return CURSEG_HOT_DATA;
		else if (is_cold_data(page) || file_is_cold(inode))
			return CURSEG_COLD_DATA;
		else
			return CURSEG_WARM_DATA;
	} else {
		if (IS_DNODE(page))
			return is_cold_node(page) ? CURSEG_WARM_NODE :
						CURSEG_HOT_NODE;
		else
			return CURSEG_COLD_NODE;
	}
}

static int __get_segment_type(struct page *page, enum page_type p_type)
{
	switch (F2FS_P_SB(page)->active_logs) {
	case 2:
		return __get_segment_type_2(page, p_type);
	case 4:
		return __get_segment_type_4(page, p_type);
	}
	/* NR_CURSEG_TYPE(6) logs by default */
	f2fs_bug_on(F2FS_P_SB(page),
		F2FS_P_SB(page)->active_logs != NR_CURSEG_TYPE);
	return __get_segment_type_6(page, p_type);
}

void allocate_data_block(struct f2fs_sb_info *sbi, struct page *page,
		block_t old_blkaddr, block_t *new_blkaddr,
		struct f2fs_summary *sum, int type)
{
	struct sit_info *sit_i = SIT_I(sbi);
	struct curseg_info *curseg = CURSEG_I(sbi, type);

	mutex_lock(&curseg->curseg_mutex);
	mutex_lock(&sit_i->sentry_lock);

	*new_blkaddr = NEXT_FREE_BLKADDR(sbi, curseg);

	f2fs_wait_discard_bio(sbi, *new_blkaddr);

	/*
	 * __add_sum_entry must be called under curseg_mutex,
	 * because it updates a summary entry in the
	 * current summary block.
	 */
	__add_sum_entry(sbi, type, sum);

	__refresh_next_blkoff(sbi, curseg);

	stat_inc_block_count(sbi, curseg);

	if (!__has_curseg_space(sbi, type))
		sit_i->s_ops->allocate_segment(sbi, type, false);
	/*
	 * SIT information should be updated before segment allocation,
	 * since SSR needs latest valid block information.
	 */
	refresh_sit_entry(sbi, old_blkaddr, *new_blkaddr);

	mutex_unlock(&sit_i->sentry_lock);

	if (page && IS_NODESEG(type))
		fill_node_footer_blkaddr(page, NEXT_FREE_BLKADDR(sbi, curseg));

	mutex_unlock(&curseg->curseg_mutex);
}

static void do_write_page(struct f2fs_summary *sum, struct f2fs_io_info *fio)
{
	int type = __get_segment_type(fio->page, fio->type);
	int err;

	if (fio->type == NODE || fio->type == DATA)
		mutex_lock(&fio->sbi->wio_mutex[fio->type]);
reallocate:
f2fs: trace old block address for CoWed page
This patch enables to trace old block address of CoWed page for better
debugging.
f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4f0, oldaddr = 0xfe8ab, newaddr = 0xfee90 rw = WRITE_SYNC, type = NODE
f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4f8, oldaddr = 0xfe8b0, newaddr = 0xfee91 rw = WRITE_SYNC, type = NODE
f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4fa, oldaddr = 0xfe8ae, newaddr = 0xfee92 rw = WRITE_SYNC, type = NODE
f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x96, oldaddr = 0xf049b, newaddr = 0x2bbe rw = WRITE, type = DATA
f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x97, oldaddr = 0xf049c, newaddr = 0x2bbf rw = WRITE, type = DATA
f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x98, oldaddr = 0xf049d, newaddr = 0x2bc0 rw = WRITE, type = DATA
f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x47, oldaddr = 0xffffffff, newaddr = 0xf2631 rw = WRITE, type = DATA
f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x48, oldaddr = 0xffffffff, newaddr = 0xf2632 rw = WRITE, type = DATA
f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x49, oldaddr = 0xffffffff, newaddr = 0xf2633 rw = WRITE, type = DATA
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-02-22 13:36:38 +03:00
	allocate_data_block(fio->sbi, fio->page, fio->old_blkaddr,
					&fio->new_blkaddr, sum, type);

	/* writeout dirty page into bdev */
	err = f2fs_submit_page_mbio(fio);
	if (err == -EAGAIN) {
		fio->old_blkaddr = fio->new_blkaddr;
		goto reallocate;
	}

	if (fio->type == NODE || fio->type == DATA)
		mutex_unlock(&fio->sbi->wio_mutex[fio->type]);
}

f2fs: prevent checkpoint once any IO failure is detected
This patch enhances the checkpoint routine to cope with IO errors.
Basically f2fs detects IO errors from end_io_write, and the errors are able to
be occurred during one of data, node, and meta page writes.
In the previous code, when an IO error is occurred during writes, f2fs sets a
flag, CP_ERROR_FLAG, in the raw checkpoint buffer which will be written to disk.
Afterwards, write_checkpoint() will check the flag and remount f2fs as a
read-only (ro) mode.
However, even once f2fs is remounted as a ro mode, dirty checkpoint pages are
freely able to be written to disk by flusher or kswapd in background.
In such a case, after cold reboot, f2fs would restore the checkpoint data having
CP_ERROR_FLAG, resulting in disabling write_checkpoint and remounting f2fs as
a ro mode again.
Therefore, let's prevent any checkpoint page (meta) writes once an IO error is
occurred, and remount f2fs as a ro mode right away at that moment.
Reported-by: Oliver Winker <oliver@oli1170.net>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
2013-01-24 14:56:11 +04:00
void write_meta_page(struct f2fs_sb_info *sbi, struct page *page)
{
	struct f2fs_io_info fio = {
		.sbi = sbi,
		.type = META,
		.op = REQ_OP_WRITE,
		.op_flags = REQ_SYNC | REQ_META | REQ_PRIO,
		.old_blkaddr = page->index,
		.new_blkaddr = page->index,
		.page = page,
		.encrypted_page = NULL,
	};

	if (unlikely(page->index >= MAIN_BLKADDR(sbi)))
		fio.op_flags &= ~REQ_META;

	set_page_writeback(page);
	f2fs_submit_page_mbio(&fio);
}

void write_node_page(unsigned int nid, struct f2fs_io_info *fio)
|
2012-11-02 12:09:16 +04:00
|
|
|
{
|
|
|
|
struct f2fs_summary sum;
|
2015-04-24 00:38:15 +03:00
|
|
|
|
2012-11-02 12:09:16 +04:00
|
|
|
set_summary(&sum, nid, 0, 0);
|
2015-04-24 00:38:15 +03:00
|
|
|
do_write_page(&sum, fio);
|
2012-11-02 12:09:16 +04:00
|
|
|
}
|
|
|
|
|
2015-04-24 00:38:15 +03:00
|
|
|
void write_data_page(struct dnode_of_data *dn, struct f2fs_io_info *fio)
|
2012-11-02 12:09:16 +04:00
|
|
|
{
|
2015-04-24 00:38:15 +03:00
|
|
|
struct f2fs_sb_info *sbi = fio->sbi;
|
2012-11-02 12:09:16 +04:00
|
|
|
struct f2fs_summary sum;
|
|
|
|
struct node_info ni;
|
|
|
|
|
2014-09-03 02:52:58 +04:00
|
|
|
f2fs_bug_on(sbi, dn->data_blkaddr == NULL_ADDR);
|
2012-11-02 12:09:16 +04:00
|
|
|
get_node_info(sbi, dn->nid, &ni);
|
|
|
|
set_summary(&sum, dn->nid, dn->ofs_in_node, ni.version);
|
2015-04-24 00:38:15 +03:00
|
|
|
do_write_page(&sum, fio);
|
2016-02-24 12:16:47 +03:00
|
|
|
f2fs_update_data_blkaddr(dn, fio->new_blkaddr);
|
2012-11-02 12:09:16 +04:00
|
|
|
}
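The two writers above differ mainly in what they record in the summary entry: the node path passes only the nid, while the data path also passes the offset inside the parent node and the node version. A minimal userspace sketch of that difference follows. The names `summary_model`, `set_summary_model`, `node_summary`, and `data_summary` are hypothetical, and the struct is a plain in-memory model, not the on-disk `struct f2fs_summary` layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory model of a summary entry (not the on-disk format). */
struct summary_model {
	uint32_t nid;         /* node id: the node itself, or the parent node of a data block */
	uint16_t ofs_in_node; /* slot of the data block inside its parent node; 0 for nodes */
	uint8_t version;      /* node version; 0 for nodes in this sketch */
};

/* Fill one summary entry, mirroring the argument order of set_summary(). */
void set_summary_model(struct summary_model *sum, uint32_t nid,
		       unsigned int ofs_in_node, uint8_t version)
{
	assert(sum != NULL);
	sum->nid = nid;
	sum->ofs_in_node = (uint16_t)ofs_in_node;
	sum->version = version;
}

/* Node page: only the nid is meaningful, as in set_summary(&sum, nid, 0, 0). */
struct summary_model node_summary(uint32_t nid)
{
	struct summary_model sum;

	set_summary_model(&sum, nid, 0, 0);
	return sum;
}

/* Data page: records the parent node, the offset within it, and its version. */
struct summary_model data_summary(uint32_t parent_nid, unsigned int ofs_in_node,
				  uint8_t version)
{
	struct summary_model sum;

	set_summary_model(&sum, parent_nid, ofs_in_node, version);
	return sum;
}
```

In this model, `node_summary(7)` and `data_summary(7, 12, 3)` share the same nid, but only the second identifies a specific block slot owned by that node.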
|
|
|
|
|
2015-04-24 00:38:15 +03:00
|
|
|
void rewrite_data_page(struct f2fs_io_info *fio)
|
2012-11-02 12:09:16 +04:00
|
|
|
{
|
f2fs: trace old block address for CoWed page
This patch enables to trace old block address of CoWed page for better
debugging.
f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4f0, oldaddr = 0xfe8ab, newaddr = 0xfee90 rw = WRITE_SYNC, type = NODE
f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4f8, oldaddr = 0xfe8b0, newaddr = 0xfee91 rw = WRITE_SYNC, type = NODE
f2fs_submit_page_mbio: dev = (1,0), ino = 1, page_index = 0x1d4fa, oldaddr = 0xfe8ae, newaddr = 0xfee92 rw = WRITE_SYNC, type = NODE
f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x96, oldaddr = 0xf049b, newaddr = 0x2bbe rw = WRITE, type = DATA
f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x97, oldaddr = 0xf049c, newaddr = 0x2bbf rw = WRITE, type = DATA
f2fs_submit_page_mbio: dev = (1,0), ino = 134824, page_index = 0x98, oldaddr = 0xf049d, newaddr = 0x2bc0 rw = WRITE, type = DATA
f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x47, oldaddr = 0xffffffff, newaddr = 0xf2631 rw = WRITE, type = DATA
f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x48, oldaddr = 0xffffffff, newaddr = 0xf2632 rw = WRITE, type = DATA
f2fs_submit_page_mbio: dev = (1,0), ino = 135260, page_index = 0x49, oldaddr = 0xffffffff, newaddr = 0xf2633 rw = WRITE, type = DATA
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-02-22 13:36:38 +03:00
|
|
|
fio->new_blkaddr = fio->old_blkaddr;
|
2015-04-24 00:38:15 +03:00
|
|
|
stat_inc_inplace_blocks(fio->sbi);
|
|
|
|
f2fs_submit_page_mbio(fio);
|
2012-11-02 12:09:16 +04:00
|
|
|
}
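rewrite_data_page() reuses the old block address and bumps an in-place counter, whereas the log-structured path ends up with a different new address, as the oldaddr/newaddr pairs in the trace output show. The contrast can be sketched in userspace as below. Everything here is an assumption made for illustration: `io_model`, `log_model`, `rewrite_in_place`, and `write_out_of_place` are hypothetical names, and the bump counter merely stands in for the real block allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the address fields of an I/O request. */
struct io_model {
	uint32_t old_blkaddr;
	uint32_t new_blkaddr;
};

/* Hypothetical stand-in for one active log plus an in-place write counter. */
struct log_model {
	uint32_t next_free;          /* next block the toy allocator hands out */
	unsigned long inplace_count; /* number of in-place rewrites so far */
};

/* In-place update: the block keeps its address, only the counter changes. */
void rewrite_in_place(struct log_model *log, struct io_model *io)
{
	assert(log != NULL && io != NULL);
	io->new_blkaddr = io->old_blkaddr;
	log->inplace_count++;
}

/* Out-of-place update: take the next free block at the head of the log. */
void write_out_of_place(struct log_model *log, struct io_model *io)
{
	assert(log != NULL && io != NULL);
	io->new_blkaddr = log->next_free++;
}
```

After an out-of-place write the caller still has to publish `new_blkaddr` to whatever index points at the block; the in-place path needs no such update because the address is unchanged.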
|
|
|
|
|
2016-02-23 12:52:43 +03:00
|
|
|
void __f2fs_replace_block(struct f2fs_sb_info *sbi, struct f2fs_summary *sum,
|
2015-05-06 08:08:06 +03:00
|
|
|
block_t old_blkaddr, block_t new_blkaddr,
|
f2fs: support revoking atomic written pages
f2fs support atomic write with following semantics:
1. open db file
2. ioctl start atomic write
3. (write db file) * n
4. ioctl commit atomic write
5. close db file
With this flow we can avoid file becoming corrupted when abnormal power
cut, because we hold data of transaction in referenced pages linked in
inmem_pages list of inode, but without setting them dirty, so these data
won't be persisted unless we commit them in step 4.
But we should still hold journal db file in memory by using volatile
write, because our semantics of 'atomic write support' is incomplete, in
step 4, we could fail to submit all dirty data of transaction, once
partial dirty data was committed in storage, then after a checkpoint &
abnormal power-cut, db file will be corrupted forever.
So this patch tries to improve atomic write flow by adding a revoking flow,
once inner error occurs in committing, this gives another chance to try to
revoke these partial submitted data of current transaction, it makes
committing operation more like an atomic one.
If we're not lucky, once revoking operation was failed, EAGAIN will be
reported to user for suggesting doing the recovery with held journal file,
or retrying current transaction again.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-02-06 09:40:34 +03:00
|
|
|
bool recover_curseg, bool recover_newaddr)
|
2012-11-02 12:09:16 +04:00
|
|
|
{
|
|
|
|
struct sit_info *sit_i = SIT_I(sbi);
|
|
|
|
struct curseg_info *curseg;
|
|
|
|
unsigned int segno, old_cursegno;
|
|
|
|
struct seg_entry *se;
|
|
|
|
int type;
|
2015-05-06 08:08:06 +03:00
|
|
|
unsigned short old_blkoff;
|
2012-11-02 12:09:16 +04:00
|
|
|
|
|
|
|
segno = GET_SEGNO(sbi, new_blkaddr);
|
|
|
|
se = get_seg_entry(sbi, segno);
|
|
|
|
type = se->type;
|
|
|
|
|
2015-05-06 08:08:06 +03:00
|
|
|
if (!recover_curseg) {
|
|
|
|
/* for recovery flow */
|
|
|
|
if (se->valid_blocks == 0 && !IS_CURSEG(sbi, segno)) {
|
|
|
|
if (old_blkaddr == NULL_ADDR)
|
|
|
|
type = CURSEG_COLD_DATA;
|
|
|
|
else
|
|
|
|
type = CURSEG_WARM_DATA;
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
if (!IS_CURSEG(sbi, segno))
|
2012-11-02 12:09:16 +04:00
|
|
|
type = CURSEG_WARM_DATA;
|
|
|
|
}
|
2015-05-06 08:08:06 +03:00
|
|
|
|
2012-11-02 12:09:16 +04:00
|
|
|
curseg = CURSEG_I(sbi, type);
|
|
|
|
|
|
|
|
mutex_lock(&curseg->curseg_mutex);
|
|
|
|
mutex_lock(&sit_i->sentry_lock);
|
|
|
|
|
|
|
|
old_cursegno = curseg->segno;
|
2015-05-06 08:08:06 +03:00
|
|
|
old_blkoff = curseg->next_blkoff;
|
2012-11-02 12:09:16 +04:00
|
|
|
|
|
|
|
/* change the current segment */
|
|
|
|
if (segno != curseg->segno) {
|
|
|
|
curseg->next_segno = segno;
|
|
|
|
change_curseg(sbi, type, true);
|
|
|
|
}
|
|
|
|
|
2014-02-04 08:01:10 +04:00
|
|
|
curseg->next_blkoff = GET_BLKOFF_FROM_SEG0(sbi, new_blkaddr);
|
2013-06-13 12:59:27 +04:00
|
|
|
__add_sum_entry(sbi, type, sum);
|
2012-11-02 12:09:16 +04:00
|
|
|
|
2016-02-06 09:40:34 +03:00
|
|
|
if (!recover_curseg || recover_newaddr)
|
2015-10-07 22:28:41 +03:00
|
|
|
update_sit_entry(sbi, new_blkaddr, 1);
|
|
|
|
if (GET_SEGNO(sbi, old_blkaddr) != NULL_SEGNO)
|
|
|
|
update_sit_entry(sbi, old_blkaddr, -1);
|
|
|
|
|
|
|
|
locate_dirty_segment(sbi, GET_SEGNO(sbi, old_blkaddr));
|
|
|
|
locate_dirty_segment(sbi, GET_SEGNO(sbi, new_blkaddr));
|
|
|
|
|
2012-11-02 12:09:16 +04:00
|
|
|
locate_dirty_segment(sbi, old_cursegno);
|
|
|
|
|
2015-05-06 08:08:06 +03:00
|
|
|
if (recover_curseg) {
|
|
|
|
if (old_cursegno != curseg->segno) {
|
|
|
|
curseg->next_segno = old_cursegno;
|
|
|
|
change_curseg(sbi, type, true);
|
|
|
|
}
|
|
|
|
curseg->next_blkoff = old_blkoff;
|
|
|
|
}
|
|
|
|
|
2012-11-02 12:09:16 +04:00
|
|
|
mutex_unlock(&sit_i->sentry_lock);
|
|
|
|
mutex_unlock(&curseg->curseg_mutex);
|
|
|
|
}
|
|
|
|
|
2015-05-28 14:15:35 +03:00
|
|
|
void f2fs_replace_block(struct f2fs_sb_info *sbi, struct dnode_of_data *dn,
|
|
|
|
block_t old_addr, block_t new_addr,
|
f2fs: support revoking atomic written pages
f2fs supports atomic write with the following semantics:
1. open db file
2. ioctl start atomic write
3. (write db file) * n
4. ioctl commit atomic write
5. close db file
With this flow we can avoid the file becoming corrupted on an abnormal power
cut, because we hold the data of the transaction in referenced pages linked in
the inmem_pages list of the inode without setting them dirty, so the data
won't be persisted unless we commit it in step 4.
But we should still hold the journal db file in memory by using volatile
write, because our 'atomic write support' semantics are incomplete: in
step 4, we could fail to submit all dirty data of the transaction. Once
partial dirty data has been committed to storage, then after a checkpoint and
an abnormal power cut the db file will be corrupted forever.
So this patch tries to improve the atomic write flow by adding a revoking flow:
once an inner error occurs during committing, it gives another chance to
revoke the partially submitted data of the current transaction, which makes
the committing operation more like an atomic one.
If we're not lucky and the revoking operation fails, EAGAIN will be
reported to the user to suggest doing the recovery with the held journal file,
or retrying the current transaction.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-02-06 09:40:34 +03:00
|
|
|
unsigned char version, bool recover_curseg,
|
|
|
|
bool recover_newaddr)
|
2015-05-28 14:15:35 +03:00
|
|
|
{
|
|
|
|
struct f2fs_summary sum;
|
|
|
|
|
|
|
|
set_summary(&sum, dn->nid, dn->ofs_in_node, version);
|
|
|
|
|
2016-02-06 09:40:34 +03:00
|
|
|
__f2fs_replace_block(sbi, &sum, old_addr, new_addr,
|
|
|
|
recover_curseg, recover_newaddr);
|
2015-05-28 14:15:35 +03:00
|
|
|
|
2016-02-24 12:16:47 +03:00
|
|
|
f2fs_update_data_blkaddr(dn, new_addr);
|
2015-05-28 14:15:35 +03:00
|
|
|
}
|
|
|
|
|
2013-11-30 07:51:14 +04:00
|
|
|
void f2fs_wait_on_page_writeback(struct page *page,
|
2016-01-20 18:43:51 +03:00
|
|
|
enum page_type type, bool ordered)
|
2013-11-30 07:51:14 +04:00
|
|
|
{
|
|
|
|
if (PageWriteback(page)) {
|
2014-09-03 02:31:18 +04:00
|
|
|
struct f2fs_sb_info *sbi = F2FS_P_SB(page);
|
|
|
|
|
f2fs: introduce f2fs_submit_merged_bio_cond
f2fs uses a single bio buffer per data type (META/NODE/DATA) to cache as many
writes to contiguous block addresses as possible. After submission, these
writes may still be cached in the bio buffer, so we have to flush them by
calling f2fs_submit_merged_bio.
Unfortunately, under high concurrency, the bio buffer could be
flushed by someone else before we submit it, for the following reasons:
a) there is no space in the bio buffer.
b) a request of a different type (SYNC, ASYNC) is added.
c) a discontinuous block address is added.
In this condition, f2fs_submit_merged_bio will be devastating, because
it could break the subsequent merging of writes in the bio buffer and split one
big bio into two smaller ones.
This patch introduces f2fs_submit_merged_bio_cond, which submits the bio
buffer conditionally. Before submitting, it checks whether:
- a page in the DATA type bio buffer matches the specified page;
- a page in the DATA type bio buffer belongs to the specified inode;
- a page in the NODE type bio buffer belongs to the specified inode;
If there is no eligible page in the bio buffer, we skip the submitting step,
which gives more chances to merge consecutive block IOs in the bio cache.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-01-18 13:28:11 +03:00
|
|
|
f2fs_submit_merged_bio_cond(sbi, NULL, page, 0, type, WRITE);
|
2016-01-20 18:43:51 +03:00
|
|
|
if (ordered)
|
|
|
|
wait_on_page_writeback(page);
|
|
|
|
else
|
|
|
|
wait_for_stable_page(page);
|
2013-11-30 07:51:14 +04:00
|
|
|
}
|
|
|
|
}
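The conditional flush that f2fs_wait_on_page_writeback() relies on can be illustrated outside the kernel. What follows is a minimal user-space sketch, not the kernel implementation: `sk_bio_buf`, `sk_add_page`, `sk_has_page`, and `sk_submit_cond` are hypothetical names, and the fixed-size array merely stands in for a per-type bio buffer. It shows only the idea that the buffer is flushed when it actually holds the page being waited on, and is otherwise left alone so later writes can still merge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_BUF_CAP 8	/* hypothetical capacity of the cached-write buffer */

/* Hypothetical stand-in for one per-type (DATA/NODE/META) bio buffer. */
struct sk_bio_buf {
	const void *pages[SK_BUF_CAP];	/* pages whose writes are still cached */
	size_t nr;			/* number of cached pages */
	unsigned int submits;		/* how many times the buffer was flushed */
};

/* Cache a page write; returns false when the buffer is full. */
static bool sk_add_page(struct sk_bio_buf *buf, const void *page)
{
	if (buf->nr == SK_BUF_CAP)
		return false;
	buf->pages[buf->nr++] = page;
	return true;
}

/* Check whether the buffer currently holds the given page. */
static bool sk_has_page(const struct sk_bio_buf *buf, const void *page)
{
	for (size_t i = 0; i < buf->nr; i++)
		if (buf->pages[i] == page)
			return true;
	return false;
}

/* Flush only if the target page is cached; otherwise skip the submit. */
static bool sk_submit_cond(struct sk_bio_buf *buf, const void *page)
{
	if (!sk_has_page(buf, page))
		return false;	/* nothing eligible: keep merging */
	buf->nr = 0;		/* "submit" everything cached so far */
	buf->submits++;
	return true;
}
```

In this sketch a caller waiting on a page that is not in the buffer leaves the cached writes untouched, which is the merging benefit the commit message describes.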
|
|
|
|
|
2015-10-08 08:27:34 +03:00
|
|
|
void f2fs_wait_on_encrypted_page_writeback(struct f2fs_sb_info *sbi,
|
|
|
|
block_t blkaddr)
|
|
|
|
{
|
|
|
|
struct page *cpage;
|
|
|
|
|
2016-09-18 03:16:56 +03:00
|
|
|
if (blkaddr == NEW_ADDR || blkaddr == NULL_ADDR)
|
2015-10-08 08:27:34 +03:00
|
|
|
return;
|
|
|
|
|
|
|
|
cpage = find_lock_page(META_MAPPING(sbi), blkaddr);
|
|
|
|
if (cpage) {
|
2016-01-20 18:43:51 +03:00
|
|
|
f2fs_wait_on_page_writeback(cpage, DATA, true);
|
2015-10-08 08:27:34 +03:00
|
|
|
f2fs_put_page(cpage, 1);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2012-11-02 12:09:16 +04:00
|
|
|
static int read_compacted_summaries(struct f2fs_sb_info *sbi)
|
|
|
|
{
|
|
|
|
struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
|
|
|
|
struct curseg_info *seg_i;
|
|
|
|
unsigned char *kaddr;
|
|
|
|
struct page *page;
|
|
|
|
block_t start;
|
|
|
|
int i, j, offset;
|
|
|
|
|
|
|
|
start = start_sum_block(sbi);
|
|
|
|
|
|
|
|
page = get_meta_page(sbi, start++);
|
|
|
|
kaddr = (unsigned char *)page_address(page);
|
|
|
|
|
|
|
|
/* Step 1: restore nat cache */
|
|
|
|
seg_i = CURSEG_I(sbi, CURSEG_HOT_DATA);
|
f2fs: split journal cache from curseg cache
In the curseg cache, f2fs caches two different parts:
- data of the current summary block, i.e. summary entries and footer info.
- journal info, i.e. sparse nat/sit entries or io stat info.
With this approach, 1) it may cause higher lock contention when we access
or update both parts of the cache, since we use the same mutex lock
curseg_mutex to protect the cache. 2) The current summary block with the last
journal info is written back to the device as a normal summary block
when flushing; however, we treat journal info as valid only in the current
summary, so most normal summary blocks contain junk journal data, which wastes
the remaining space of the summary block.
So, in order to fix the above issues, we split the curseg cache into two parts:
a) current summary block, protected by the original mutex lock curseg_mutex
b) journal cache, protected by the newly introduced r/w semaphore journal_rwsem
When loading the curseg cache during ->mount, we store summary info and
journal info into different caches; when doing a checkpoint, we combine
the data of the two caches into the current summary block for persisting.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-02-19 13:08:46 +03:00
|
|
|
memcpy(seg_i->journal, kaddr, SUM_JOURNAL_SIZE);
|
2012-11-02 12:09:16 +04:00
|
|
|
|
|
|
|
/* Step 2: restore sit cache */
|
|
|
|
seg_i = CURSEG_I(sbi, CURSEG_COLD_DATA);
|
2016-02-19 13:08:46 +03:00
|
|
|
memcpy(seg_i->journal, kaddr + SUM_JOURNAL_SIZE, SUM_JOURNAL_SIZE);
|
2012-11-02 12:09:16 +04:00
|
|
|
offset = 2 * SUM_JOURNAL_SIZE;
|
|
|
|
|
|
|
|
/* Step 3: restore summary entries */
|
|
|
|
for (i = CURSEG_HOT_DATA; i <= CURSEG_COLD_DATA; i++) {
|
|
|
|
unsigned short blk_off;
|
|
|
|
unsigned int segno;
|
|
|
|
|
|
|
|
seg_i = CURSEG_I(sbi, i);
|
|
|
|
segno = le32_to_cpu(ckpt->cur_data_segno[i]);
|
|
|
|
blk_off = le16_to_cpu(ckpt->cur_data_blkoff[i]);
|
|
|
|
seg_i->next_segno = segno;
|
|
|
|
reset_curseg(sbi, i, 0);
|
|
|
|
seg_i->alloc_type = ckpt->alloc_type[i];
|
|
|
|
seg_i->next_blkoff = blk_off;
|
|
|
|
|
|
|
|
if (seg_i->alloc_type == SSR)
|
|
|
|
blk_off = sbi->blocks_per_seg;
|
|
|
|
|
|
|
|
for (j = 0; j < blk_off; j++) {
|
|
|
|
struct f2fs_summary *s;
|
|
|
|
s = (struct f2fs_summary *)(kaddr + offset);
|
|
|
|
seg_i->sum_blk->entries[j] = *s;
|
|
|
|
offset += SUMMARY_SIZE;
|
mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
ago with the promise that one day it would be possible to implement the page
cache with bigger chunks than PAGE_SIZE.
This promise never materialized, and it is unlikely it ever will.
We have many places where PAGE_CACHE_SIZE is assumed to be equal to
PAGE_SIZE. And it's a constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.
Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
breakage to be doable.
Let's stop pretending that pages in page cache are special. They are
not.
The changes are pretty straight-forward:
- <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
- page_cache_get() -> get_page();
- page_cache_release() -> put_page();
This patch contains automated changes generated with coccinelle using
the script below. For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.
The only adjustment after coccinelle is a revert of the changes to the
PAGE_CACHE_ALIGN definition: we are going to drop it later.
There are a few places in the code where coccinelle didn't reach. I'll
fix them manually in a separate patch. Comments and documentation will also
be addressed in a separate patch.
virtual patch
@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT
@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE
@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK
@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)
@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)
@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-01 15:29:47 +03:00
|
|
|
if (offset + SUMMARY_SIZE <= PAGE_SIZE -
|
2012-11-02 12:09:16 +04:00
|
|
|
SUM_FOOTER_SIZE)
|
|
|
|
continue;
|
|
|
|
|
|
|
|
f2fs_put_page(page, 1);
|
|
|
|
page = NULL;
|
|
|
|
|
|
|
|
page = get_meta_page(sbi, start++);
|
|
|
|
kaddr = (unsigned char *)page_address(page);
|
|
|
|
offset = 0;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
f2fs_put_page(page, 1);
|
|
|
|
return 0;
|
|
|
|
}
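The page-crossing logic in read_compacted_summaries() can be checked in isolation. The sketch below is a user-space model, not kernel code: `sk_pages_visited` is a hypothetical helper, and the four `SK_*` sizes are assumptions chosen for illustration (the kernel takes the real values from its on-disk format header). It treats the summary entries of all data logs as one stream, which matches the loop above in that `offset` and the current page carry over from one log to the next.

```c
#include <assert.h>

/*
 * Assumed sizes, for illustration only; the kernel uses PAGE_SIZE,
 * SUMMARY_SIZE, SUM_FOOTER_SIZE and SUM_JOURNAL_SIZE instead.
 */
#define SK_PAGE_SIZE	4096	/* assumed page size */
#define SK_SUMMARY_SIZE	7	/* assumed size of one summary entry */
#define SK_FOOTER_SIZE	5	/* assumed size of the summary footer */
#define SK_JOURNAL_SIZE	507	/* assumed size of one journal area */

/*
 * Count the meta pages the compacted-summary loop would visit while
 * copying nr_entries summary entries. The first page starts after the
 * two journal areas (nat and sit); later pages start at offset 0.
 */
static int sk_pages_visited(int nr_entries)
{
	int offset = 2 * SK_JOURNAL_SIZE;
	int pages = 1;
	int j;

	for (j = 0; j < nr_entries; j++) {
		/* the entry located at `offset` would be copied here */
		offset += SK_SUMMARY_SIZE;
		if (offset + SK_SUMMARY_SIZE <= SK_PAGE_SIZE - SK_FOOTER_SIZE)
			continue;
		/* no room for another full entry: move to the next page */
		pages++;
		offset = 0;
	}
	return pages;
}
```

With these assumed sizes, the first page has room for 439 entries after the two journal areas, and each later page holds 584. As in the kernel loop, the next page is fetched as soon as the current one fills up, even if no further entries remain to be read.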
|
|
|
|
|
|
|
|
static int read_normal_summaries(struct f2fs_sb_info *sbi, int type)
|
|
|
|
{
|
|
|
|
struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
|
|
|
|
struct f2fs_summary_block *sum;
|
|
|
|
struct curseg_info *curseg;
|
|
|
|
struct page *new;
|
|
|
|
unsigned short blk_off;
|
|
|
|
unsigned int segno = 0;
|
|
|
|
block_t blk_addr = 0;
|
|
|
|
|
|
|
|
/* get segment number and block addr */
|
|
|
|
if (IS_DATASEG(type)) {
|
|
|
|
segno = le32_to_cpu(ckpt->cur_data_segno[type]);
|
|
|
|
blk_off = le16_to_cpu(ckpt->cur_data_blkoff[type -
|
|
|
|
CURSEG_HOT_DATA]);
|
2015-01-29 22:45:33 +03:00
|
|
|
if (__exist_node_summaries(sbi))
|
2012-11-02 12:09:16 +04:00
|
|
|
blk_addr = sum_blk_addr(sbi, NR_CURSEG_TYPE, type);
|
|
|
|
else
|
|
|
|
blk_addr = sum_blk_addr(sbi, NR_CURSEG_DATA_TYPE, type);
|
|
|
|
} else {
|
|
|
|
segno = le32_to_cpu(ckpt->cur_node_segno[type -
|
|
|
|
CURSEG_HOT_NODE]);
|
|
|
|
blk_off = le16_to_cpu(ckpt->cur_node_blkoff[type -
|
|
|
|
CURSEG_HOT_NODE]);
|
2015-01-29 22:45:33 +03:00
|
|
|
if (__exist_node_summaries(sbi))
|
2012-11-02 12:09:16 +04:00
|
|
|
blk_addr = sum_blk_addr(sbi, NR_CURSEG_NODE_TYPE,
|
|
|
|
type - CURSEG_HOT_NODE);
|
|
|
|
else
|
|
|
|
blk_addr = GET_SUM_BLOCK(sbi, segno);
|
|
|
|
}
|
|
|
|
|
|
|
|
new = get_meta_page(sbi, blk_addr);
|
|
|
|
sum = (struct f2fs_summary_block *)page_address(new);
|
|
|
|
|
|
|
|
if (IS_NODESEG(type)) {
|
2015-01-29 22:45:33 +03:00
|
|
|
if (__exist_node_summaries(sbi)) {
|
2012-11-02 12:09:16 +04:00
|
|
|
struct f2fs_summary *ns = &sum->entries[0];
|
|
|
|
int i;
|
|
|
|
for (i = 0; i < sbi->blocks_per_seg; i++, ns++) {
|
|
|
|
ns->version = 0;
|
|
|
|
ns->ofs_in_node = 0;
|
|
|
|
}
|
|
|
|
} else {
|
2014-03-07 14:43:36 +04:00
|
|
|
int err;
|
|
|
|
|
|
|
|
err = restore_node_summary(sbi, segno, sum);
|
|
|
|
if (err) {
|
2012-11-02 12:09:16 +04:00
|
|
|
f2fs_put_page(new, 1);
|
2014-03-07 14:43:36 +04:00
|
|
|
return err;
|
2012-11-02 12:09:16 +04:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/* set uncompleted segment to curseg */
|
|
|
|
curseg = CURSEG_I(sbi, type);
|
|
|
|
mutex_lock(&curseg->curseg_mutex);
|
2016-02-19 13:08:46 +03:00
|
|
|
|
|
|
|
/* update journal info */
|
|
|
|
down_write(&curseg->journal_rwsem);
|
|
|
|
memcpy(curseg->journal, &sum->journal, SUM_JOURNAL_SIZE);
|
|
|
|
up_write(&curseg->journal_rwsem);
|
|
|
|
|
|
|
|
memcpy(curseg->sum_blk->entries, sum->entries, SUM_ENTRY_SIZE);
|
|
|
|
memcpy(&curseg->sum_blk->footer, &sum->footer, SUM_FOOTER_SIZE);
|
2012-11-02 12:09:16 +04:00
|
|
|
curseg->next_segno = segno;
|
|
|
|
reset_curseg(sbi, type, 0);
|
|
|
|
curseg->alloc_type = ckpt->alloc_type[type];
|
|
|
|
curseg->next_blkoff = blk_off;
|
|
|
|
mutex_unlock(&curseg->curseg_mutex);
|
|
|
|
f2fs_put_page(new, 1);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int restore_curseg_summaries(struct f2fs_sb_info *sbi)
|
|
|
|
{
|
|
|
|
int type = CURSEG_HOT_DATA;
|
2014-03-17 12:36:24 +04:00
|
|
|
int err;
|
2012-11-02 12:09:16 +04:00
|
|
|
|
2016-09-20 06:04:18 +03:00
|
|
|
if (is_set_ckpt_flags(sbi, CP_COMPACT_SUM_FLAG)) {
|
2014-12-09 09:21:46 +03:00
|
|
|
int npages = npages_for_summary_flush(sbi, true);
|
|
|
|
|
|
|
|
if (npages >= 2)
|
|
|
|
ra_meta_pages(sbi, start_sum_block(sbi), npages,
|
2015-10-12 12:05:59 +03:00
|
|
|
META_CP, true);
|
2014-12-09 09:21:46 +03:00
|
|
|
|
2012-11-02 12:09:16 +04:00
|
|
|
/* restore for compacted data summary */
|
|
|
|
if (read_compacted_summaries(sbi))
|
|
|
|
return -EINVAL;
|
|
|
|
type = CURSEG_HOT_NODE;
|
|
|
|
}
|
|
|
|
|
2015-01-29 22:45:33 +03:00
|
|
|
if (__exist_node_summaries(sbi))
|
2014-12-09 09:21:46 +03:00
|
|
|
ra_meta_pages(sbi, sum_blk_addr(sbi, NR_CURSEG_TYPE, type),
|
2015-10-12 12:05:59 +03:00
|
|
|
NR_CURSEG_TYPE - type, META_CP, true);
|
2014-12-09 09:21:46 +03:00
|
|
|
|
2014-03-17 12:36:24 +04:00
|
|
|
for (; type <= CURSEG_COLD_NODE; type++) {
|
|
|
|
err = read_normal_summaries(sbi, type);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
2012-11-02 12:09:16 +04:00
|
|
|
return 0;
|
|
|
|
}
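The restore order in restore_curseg_summaries() can be sketched in userspace as follows. This is a minimal sketch, not kernel code: the sketch_* names and the enum are hypothetical, and the enum ordering (three data logs followed by three node logs) is assumed to mirror the kernel's curseg types at this revision. It shows that a compacted checkpoint restores the data logs in one step, so only the node logs go through read_normal_summaries().

```c
#include <assert.h>

/* Hypothetical mirror of the curseg type ordering (assumption). */
enum sketch_curseg {
	SK_HOT_DATA, SK_WARM_DATA, SK_COLD_DATA,
	SK_HOT_NODE, SK_WARM_NODE, SK_COLD_NODE,
	SK_NR_CURSEG_TYPE,
};

struct sketch_restore_plan {
	int first_normal_type;	/* first type read via the normal path */
	int normal_reads;	/* iterations of the final for-loop */
	int readahead_blocks;	/* NR_CURSEG_TYPE - type, as passed to readahead */
};

/*
 * compact_sum stands in for is_set_ckpt_flags(sbi, CP_COMPACT_SUM_FLAG).
 * The real function only issues the second readahead when node summaries
 * exist; that condition is not modelled here.
 */
static struct sketch_restore_plan sketch_plan_restore(int compact_sum)
{
	struct sketch_restore_plan plan;

	plan.first_normal_type = compact_sum ? SK_HOT_NODE : SK_HOT_DATA;
	plan.normal_reads = SK_COLD_NODE - plan.first_normal_type + 1;
	plan.readahead_blocks = SK_NR_CURSEG_TYPE - plan.first_normal_type;

	/* The loop bound and the readahead count cover the same types. */
	assert(plan.normal_reads == plan.readahead_blocks);
	return plan;
}
```

Calling sketch_plan_restore(1) models a compacted checkpoint and sketch_plan_restore(0) a normal one.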
|
|
|
|
|
|
|
|
static void write_compacted_summaries(struct f2fs_sb_info *sbi, block_t blkaddr)
|
|
|
|
{
|
|
|
|
struct page *page;
|
|
|
|
unsigned char *kaddr;
|
|
|
|
struct f2fs_summary *summary;
|
|
|
|
struct curseg_info *seg_i;
|
|
|
|
int written_size = 0;
|
|
|
|
int i, j;
|
|
|
|
|
|
|
|
page = grab_meta_page(sbi, blkaddr++);
|
|
|
|
kaddr = (unsigned char *)page_address(page);
|
|
|
|
|
|
|
|
/* Step 1: write nat cache */
|
|
|
|
seg_i = CURSEG_I(sbi, CURSEG_HOT_DATA);
|
2016-02-19 13:08:46 +03:00
|
|
|
memcpy(kaddr, seg_i->journal, SUM_JOURNAL_SIZE);
|
2012-11-02 12:09:16 +04:00
|
|
|
written_size += SUM_JOURNAL_SIZE;
|
|
|
|
|
|
|
|
/* Step 2: write sit cache */
|
|
|
|
seg_i = CURSEG_I(sbi, CURSEG_COLD_DATA);
|
2016-02-19 13:08:46 +03:00
|
|
|
memcpy(kaddr + written_size, seg_i->journal, SUM_JOURNAL_SIZE);
|
2012-11-02 12:09:16 +04:00
|
|
|
written_size += SUM_JOURNAL_SIZE;
|
|
|
|
|
|
|
|
/* Step 3: write summary entries */
|
|
|
|
for (i = CURSEG_HOT_DATA; i <= CURSEG_COLD_DATA; i++) {
|
|
|
|
unsigned short blkoff;
|
|
|
|
seg_i = CURSEG_I(sbi, i);
|
|
|
|
if (sbi->ckpt->alloc_type[i] == SSR)
|
|
|
|
blkoff = sbi->blocks_per_seg;
|
|
|
|
else
|
|
|
|
blkoff = curseg_blkoff(sbi, i);
|
|
|
|
|
|
|
|
for (j = 0; j < blkoff; j++) {
|
|
|
|
if (!page) {
|
|
|
|
page = grab_meta_page(sbi, blkaddr++);
|
|
|
|
kaddr = (unsigned char *)page_address(page);
|
|
|
|
written_size = 0;
|
|
|
|
}
|
|
|
|
summary = (struct f2fs_summary *)(kaddr + written_size);
|
|
|
|
*summary = seg_i->sum_blk->entries[j];
|
|
|
|
written_size += SUMMARY_SIZE;
|
|
|
|
|
mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.
This promise never materialized. And unlikely will.
We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE. And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.
Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.
Let's stop pretending that pages in page cache are special. They are
not.
The changes are pretty straight-forward:
- <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
- page_cache_get() -> get_page();
- page_cache_release() -> put_page();
This patch contains automated changes generated with coccinelle using
script below. For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.
The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.
There are few places in the code where coccinelle didn't reach. I'll
fix them manually in a separate patch. Comments and documentation also
will be addressed with the separate patch.
virtual patch
@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT
@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE
@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK
@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)
@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)
@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-01 15:29:47 +03:00
|
|
|
if (written_size + SUMMARY_SIZE <= PAGE_SIZE -
|
2012-11-02 12:09:16 +04:00
|
|
|
SUM_FOOTER_SIZE)
|
|
|
|
continue;
|
|
|
|
|
2013-10-24 11:08:28 +04:00
|
|
|
set_page_dirty(page);
|
2012-11-02 12:09:16 +04:00
|
|
|
f2fs_put_page(page, 1);
|
|
|
|
page = NULL;
|
|
|
|
}
|
|
|
|
}
|
2013-10-24 11:08:28 +04:00
|
|
|
if (page) {
|
|
|
|
set_page_dirty(page);
|
2012-11-02 12:09:16 +04:00
|
|
|
f2fs_put_page(page, 1);
|
2013-10-24 11:08:28 +04:00
|
|
|
}
|
2012-11-02 12:09:16 +04:00
|
|
|
}
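The page-packing rule in write_compacted_summaries() can be sketched in userspace as follows. This is a minimal sketch, not kernel code: sketch_compacted_pages and the SK_* constants are hypothetical names, and the sizes are assumptions for a 4 KiB block (7-byte summary entry, 5-byte footer, 507-byte journal area). The first page starts after the two journal areas, every page reserves room for a footer, and a page is flushed once another entry would no longer fit.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes for a 4 KiB block; the kernel derives these from its headers. */
enum {
	SK_PAGE_SIZE = 4096,
	SK_SUMMARY_SIZE = 7,
	SK_SUM_FOOTER_SIZE = 5,
	SK_SUM_JOURNAL_SIZE = 507,
};

/* Count the meta pages used when nr_entries summary entries are packed. */
static unsigned int sketch_compacted_pages(unsigned int nr_entries)
{
	unsigned int pages = 1;			/* first page is always grabbed */
	int page_open = 1;
	size_t written = 2 * SK_SUM_JOURNAL_SIZE;	/* NAT + SIT journals */
	unsigned int i;

	for (i = 0; i < nr_entries; i++) {
		if (!page_open) {		/* mirrors the "if (!page)" branch */
			pages++;
			page_open = 1;
			written = 0;
		}
		written += SK_SUMMARY_SIZE;
		/* Entries never spill into the space reserved for the footer. */
		assert(written <= SK_PAGE_SIZE - SK_SUM_FOOTER_SIZE);

		if (written + SK_SUMMARY_SIZE <=
				SK_PAGE_SIZE - SK_SUM_FOOTER_SIZE)
			continue;
		page_open = 0;			/* page is full: flush it */
	}
	return pages;
}
```

Under these assumed sizes the first page holds 439 entries and each later page holds 584, so the three data logs need at most three pages in this sketch.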
|
|
|
|
|
|
|
|
static void write_normal_summaries(struct f2fs_sb_info *sbi,
|
|
|
|
block_t blkaddr, int type)
|
|
|
|
{
|
|
|
|
int i, end;
|
|
|
|
if (IS_DATASEG(type))
|
|
|
|
end = type + NR_CURSEG_DATA_TYPE;
|
|
|
|
else
|
|
|
|
end = type + NR_CURSEG_NODE_TYPE;
|
|
|
|
|
2016-02-19 13:08:46 +03:00
|
|
|
for (i = type; i < end; i++)
|
|
|
|
write_current_sum_page(sbi, i, blkaddr + (i - type));
|
2012-11-02 12:09:16 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
void write_data_summaries(struct f2fs_sb_info *sbi, block_t start_blk)
|
|
|
|
{
|
2016-09-20 06:04:18 +03:00
|
|
|
if (is_set_ckpt_flags(sbi, CP_COMPACT_SUM_FLAG))
|
2012-11-02 12:09:16 +04:00
|
|
|
write_compacted_summaries(sbi, start_blk);
|
|
|
|
else
|
|
|
|
write_normal_summaries(sbi, start_blk, CURSEG_HOT_DATA);
|
|
|
|
}
|
|
|
|
|
|
|
|
void write_node_summaries(struct f2fs_sb_info *sbi, block_t start_blk)
|
|
|
|
{
|
2015-01-29 22:45:33 +03:00
|
|
|
write_normal_summaries(sbi, start_blk, CURSEG_HOT_NODE);
|
2012-11-02 12:09:16 +04:00
|
|
|
}
|
|
|
|
|
2016-02-14 13:50:40 +03:00
|
|
|
int lookup_journal_in_cursum(struct f2fs_journal *journal, int type,
|
2012-11-02 12:09:16 +04:00
|
|
|
unsigned int val, int alloc)
|
|
|
|
{
|
|
|
|
int i;
|
|
|
|
|
|
|
|
if (type == NAT_JOURNAL) {
|
2016-02-14 13:50:40 +03:00
|
|
|
for (i = 0; i < nats_in_cursum(journal); i++) {
|
|
|
|
if (le32_to_cpu(nid_in_journal(journal, i)) == val)
|
2012-11-02 12:09:16 +04:00
|
|
|
return i;
|
|
|
|
}
|
2016-02-14 13:50:40 +03:00
|
|
|
if (alloc && __has_cursum_space(journal, 1, NAT_JOURNAL))
|
|
|
|
return update_nats_in_cursum(journal, 1);
|
2012-11-02 12:09:16 +04:00
|
|
|
} else if (type == SIT_JOURNAL) {
|
2016-02-14 13:50:40 +03:00
|
|
|
for (i = 0; i < sits_in_cursum(journal); i++)
|
|
|
|
if (le32_to_cpu(segno_in_journal(journal, i)) == val)
|
2012-11-02 12:09:16 +04:00
|
|
|
return i;
|
2016-02-14 13:50:40 +03:00
|
|
|
if (alloc && __has_cursum_space(journal, 1, SIT_JOURNAL))
|
|
|
|
return update_sits_in_cursum(journal, 1);
|
2012-11-02 12:09:16 +04:00
|
|
|
}
|
|
|
|
return -1;
|
|
|
|
}
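The lookup above follows a find-or-allocate pattern: scan the used journal slots for a matching key, and only if none matches and the caller asked for allocation, hand out the next free slot. A minimal userspace sketch of that pattern follows. The names `struct journal`, `JOURNAL_SLOTS` and `journal_lookup` are hypothetical and not part of f2fs. Unlike the kernel helper, which appears to only reserve the slot and leave filling it to the caller, this sketch also stores the key so that it stays self-contained.

```c
#include <assert.h>
#include <stddef.h>

#define JOURNAL_SLOTS 6 /* hypothetical capacity, not the real f2fs value */

/* Hypothetical stand-in for the NAT or SIT journal of the current summary. */
struct journal {
	int n_entries;                     /* number of used slots */
	unsigned int keys[JOURNAL_SLOTS];  /* nid or segno per slot */
};

/*
 * Return the slot index holding @key. If the key is absent and @alloc is
 * non-zero, claim the next free slot for it. Return -1 when the key is
 * absent and no slot was (or could be) allocated.
 */
static int journal_lookup(struct journal *j, unsigned int key, int alloc)
{
	int i;

	assert(j != NULL);

	for (i = 0; i < j->n_entries; i++)
		if (j->keys[i] == key)
			return i;

	if (alloc && j->n_entries < JOURNAL_SLOTS) {
		j->keys[j->n_entries] = key;
		return j->n_entries++;   /* index of the newly claimed slot */
	}
	return -1;
}
```

A caller that only wants to probe passes `alloc = 0`; a caller that is about to write an entry passes `alloc = 1` and must handle `-1`, which signals a full journal.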
|
|
|
|
|
|
|
|
static struct page *get_current_sit_page(struct f2fs_sb_info *sbi,
|
|
|
|
unsigned int segno)
|
|
|
|
{
|
2014-10-20 13:45:49 +04:00
|
|
|
return get_meta_page(sbi, current_sit_addr(sbi, segno));
|
2012-11-02 12:09:16 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
static struct page *get_next_sit_page(struct f2fs_sb_info *sbi,
|
|
|
|
unsigned int start)
|
|
|
|
{
|
|
|
|
struct sit_info *sit_i = SIT_I(sbi);
|
|
|
|
struct page *src_page, *dst_page;
|
|
|
|
pgoff_t src_off, dst_off;
|
|
|
|
void *src_addr, *dst_addr;
|
|
|
|
|
|
|
|
src_off = current_sit_addr(sbi, start);
|
|
|
|
dst_off = next_sit_addr(sbi, src_off);
|
|
|
|
|
|
|
|
/* get current sit block page without lock */
|
|
|
|
src_page = get_meta_page(sbi, src_off);
|
|
|
|
dst_page = grab_meta_page(sbi, dst_off);
|
2014-09-03 02:52:58 +04:00
|
|
|
f2fs_bug_on(sbi, PageDirty(src_page));
|
2012-11-02 12:09:16 +04:00
|
|
|
|
|
|
|
src_addr = page_address(src_page);
|
|
|
|
dst_addr = page_address(dst_page);
|
mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.
This promise never materialized. And unlikely will.
We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE. And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.
Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.
Let's stop pretending that pages in page cache are special. They are
not.
The changes are pretty straight-forward:
- <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
- page_cache_get() -> get_page();
- page_cache_release() -> put_page();
This patch contains automated changes generated with coccinelle using
script below. For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.
The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.
There are few places in the code where coccinelle didn't reach. I'll
fix them manually in a separate patch. Comments and documentation also
will be addressed with the separate patch.
virtual patch
@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT
@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE
@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK
@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)
@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)
@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-01 15:29:47 +03:00
|
|
|
memcpy(dst_addr, src_addr, PAGE_SIZE);
|
2012-11-02 12:09:16 +04:00
|
|
|
|
|
|
|
set_page_dirty(dst_page);
|
|
|
|
f2fs_put_page(src_page, 1);
|
|
|
|
|
|
|
|
set_to_next_sit(sit_i, start);
|
|
|
|
|
|
|
|
return dst_page;
|
|
|
|
}
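get_next_sit_page() reads as a copy-on-write step between two on-disk copies of each SIT block: the current copy is duplicated into the alternate location, the alternate page is dirtied, and the per-block flag is flipped so that later lookups resolve to the new copy. The following is a simplified userspace model of that idea, not the kernel code. `struct sit_area`, `BLK_SIZE`, `NBLOCKS` and the helper names are hypothetical, and a byte array stands in for the real bitmap.

```c
#include <assert.h>
#include <string.h>

#define BLK_SIZE 16 /* hypothetical tiny block size */
#define NBLOCKS  4  /* hypothetical number of SIT blocks per copy */

/* Two back-to-back copies of the SIT area plus a "which copy is live" flag. */
struct sit_area {
	unsigned char blocks[2 * NBLOCKS][BLK_SIZE]; /* copy 0, then copy 1 */
	unsigned char use_second[NBLOCKS];           /* 1 if copy 1 is current */
};

/* Address (block index) of the live copy for SIT block @off. */
static unsigned int current_addr(const struct sit_area *a, unsigned int off)
{
	assert(off < NBLOCKS);
	return a->use_second[off] ? off + NBLOCKS : off;
}

/* Address of the other copy of the same SIT block. */
static unsigned int next_addr(unsigned int addr)
{
	return addr < NBLOCKS ? addr + NBLOCKS : addr - NBLOCKS;
}

/* Copy the live block into its alternate slot and make that slot live. */
static unsigned char *get_next_block(struct sit_area *a, unsigned int off)
{
	unsigned int src = current_addr(a, off);
	unsigned int dst = next_addr(src);

	memcpy(a->blocks[dst], a->blocks[src], BLK_SIZE);
	a->use_second[off] ^= 1; /* flip the live-copy flag */
	return a->blocks[dst];
}
```

Calling `get_next_block()` twice for the same offset returns to the original slot, which is the ping-pong behavior the model is meant to show.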
|
|
|
|
|
f2fs: refactor flush_sit_entries codes for reducing SIT writes
In commit aec71382c681 ("f2fs: refactor flush_nat_entries codes for reducing NAT
writes"), we descripte the issue as below:
"Although building NAT journal in cursum reduce the read/write work for NAT
block, but previous design leave us lower performance when write checkpoint
frequently for these cases:
1. if journal in cursum has already full, it's a bit of waste that we flush all
nat entries to page for persistence, but not to cache any entries.
2. if journal in cursum is not full, we fill nat entries to journal util
journal is full, then flush the left dirty entries to disk without merge
journaled entries, so these journaled entries may be flushed to disk at next
checkpoint but lost chance to flushed last time."
Actually, we have the same problem in using SIT journal area.
In this patch, firstly we will update sit journal with dirty entries as many as
possible. Secondly if there is no space in sit journal, we will remove all
entries in journal and walk through the whole dirty entry bitmap of sit,
accounting dirty sit entries located in same SIT block to sit entry set. All
entry sets are linked to list sit_entry_set in sm_info, sorted ascending order
by count of entries in set. Later we flush entries in set which have fewest
entries into journal as many as we can, and then flush dense set with merged
entries to disk.
In this way we can use sit journal area more effectively, also we will reduce
SIT update, result in gaining in performance and saving lifetime of flash
device.
In my testing environment, it shows this patch can help to reduce SIT block
update obviously.
virtual machine + hard disk:
fsstress -p 20 -n 400 -l 5
sit page num cp count sit pages/cp
based 2006.50 1349.75 1.486
patched 1566.25 1463.25 1.070
Our latency of merging op is small when handling a great number of dirty SIT
entries in flush_sit_entries:
latency(ns) dirty sit count
36038 2151
49168 2123
37174 2232
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-04 14:13:01 +04:00
|
|
|
static struct sit_entry_set *grab_sit_entry_set(void)
|
|
|
|
{
|
|
|
|
struct sit_entry_set *ses =
|
2015-08-20 18:51:56 +03:00
|
|
|
f2fs_kmem_cache_alloc(sit_entry_set_slab, GFP_NOFS);
|
2014-09-04 14:13:01 +04:00
|
|
|
|
|
|
|
ses->entry_cnt = 0;
|
|
|
|
INIT_LIST_HEAD(&ses->set_list);
|
|
|
|
return ses;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void release_sit_entry_set(struct sit_entry_set *ses)
|
|
|
|
{
|
|
|
|
list_del(&ses->set_list);
|
|
|
|
kmem_cache_free(sit_entry_set_slab, ses);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void adjust_sit_entry_set(struct sit_entry_set *ses,
|
|
|
|
struct list_head *head)
|
|
|
|
{
|
|
|
|
struct sit_entry_set *next = ses;
|
|
|
|
|
|
|
|
if (list_is_last(&ses->set_list, head))
|
|
|
|
return;
|
|
|
|
|
|
|
|
list_for_each_entry_continue(next, head, set_list)
|
|
|
|
if (ses->entry_cnt <= next->entry_cnt)
|
|
|
|
break;
|
|
|
|
|
|
|
|
list_move_tail(&ses->set_list, &next->set_list);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void add_sit_entry(unsigned int segno, struct list_head *head)
|
|
|
|
{
|
|
|
|
struct sit_entry_set *ses;
|
|
|
|
unsigned int start_segno = START_SEGNO(segno);
|
|
|
|
|
|
|
|
list_for_each_entry(ses, head, set_list) {
|
|
|
|
if (ses->start_segno == start_segno) {
|
|
|
|
ses->entry_cnt++;
|
|
|
|
adjust_sit_entry_set(ses, head);
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
ses = grab_sit_entry_set();
|
|
|
|
|
|
|
|
ses->start_segno = start_segno;
|
|
|
|
ses->entry_cnt++;
|
|
|
|
list_add(&ses->set_list, head);
|
|
|
|
}
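Taken together, grab_sit_entry_set(), adjust_sit_entry_set() and add_sit_entry() keep one set per SIT block and hold the list in ascending order of entry_cnt, so the sparsest sets come first. The sketch below models that bookkeeping in userspace with a singly linked list instead of the kernel's `list_head`. `struct entry_set`, `ENTRIES_PER_BLOCK`, `add_entry` and `is_sorted` are hypothetical names, and the value 55 for entries per SIT block is an assumption made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define ENTRIES_PER_BLOCK 55u /* assumption: SIT entries per SIT block */

/* Hypothetical userspace stand-in for struct sit_entry_set. */
struct entry_set {
	struct entry_set *next;
	unsigned int start_segno; /* first segno covered by this SIT block */
	unsigned int entry_cnt;   /* dirty entries counted in this block */
};

/* Account one dirty segno; keep the list sorted by ascending entry_cnt. */
static struct entry_set *add_entry(struct entry_set **head, unsigned int segno)
{
	unsigned int start = segno / ENTRIES_PER_BLOCK * ENTRIES_PER_BLOCK;
	struct entry_set **pp, *ses;

	for (pp = head; (ses = *pp) != NULL; pp = &ses->next) {
		if (ses->start_segno != start)
			continue;
		ses->entry_cnt++;
		/* Unlink, then re-insert before the first set that is not smaller. */
		*pp = ses->next;
		while (*pp && (*pp)->entry_cnt < ses->entry_cnt)
			pp = &(*pp)->next;
		ses->next = *pp;
		*pp = ses;
		return ses;
	}

	/* New set: a count of 1 is the minimum, so the list head is its place. */
	ses = malloc(sizeof(*ses));
	assert(ses != NULL);
	ses->start_segno = start;
	ses->entry_cnt = 1;
	ses->next = *head;
	*head = ses;
	return ses;
}

/* Return 1 if the list is in ascending entry_cnt order, else 0. */
static int is_sorted(const struct entry_set *head)
{
	for (; head && head->next; head = head->next)
		if (head->entry_cnt > head->next->entry_cnt)
			return 0;
	return 1;
}
```

Because a set's count only grows by one at a time, the re-insert step only ever walks forward from the set's old position, which mirrors the forward-only scan in adjust_sit_entry_set().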
|
|
|
|
|
|
|
|
static void add_sits_in_set(struct f2fs_sb_info *sbi)
|
|
|
|
{
|
|
|
|
struct f2fs_sm_info *sm_info = SM_I(sbi);
|
|
|
|
struct list_head *set_list = &sm_info->sit_entry_set;
|
|
|
|
unsigned long *bitmap = SIT_I(sbi)->dirty_sentries_bitmap;
|
|
|
|
unsigned int segno;
|
|
|
|
|
2014-09-23 22:23:01 +04:00
|
|
|
for_each_set_bit(segno, bitmap, MAIN_SEGS(sbi))
|
2014-09-04 14:13:01 +04:00
|
|
|
add_sit_entry(segno, set_list);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void remove_sits_in_journal(struct f2fs_sb_info *sbi)
|
2012-11-02 12:09:16 +04:00
|
|
|
{
|
|
|
|
struct curseg_info *curseg = CURSEG_I(sbi, CURSEG_COLD_DATA);
|
f2fs: split journal cache from curseg cache
In curseg cache, f2fs caches two different parts:
- datas of current summay block, i.e. summary entries, footer info.
- journal info, i.e. sparse nat/sit entries or io stat info.
With this approach, 1) it may cause higher lock contention when we access
or update both of the parts of cache since we use the same mutex lock
curseg_mutex to protect the cache. 2) current summary block with last
journal info will be writebacked into device as a normal summary block
when flushing, however, we treat journal info as valid one only in current
summary, so most normal summary blocks contain junk journal data, it wastes
remaining space of summary block.
So, in order to fix above issues, we split curseg cache into two parts:
a) current summary block, protected by original mutex lock curseg_mutex
b) journal cache, protected by newly introduced r/w semaphore journal_rwsem
When loading curseg cache during ->mount, we store summary info and
journal info into different caches; When doing checkpoint, we combine
datas of two cache into current summary block for persisting.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-02-19 13:08:46 +03:00
|
|
|
struct f2fs_journal *journal = curseg->journal;
|
2012-11-02 12:09:16 +04:00
|
|
|
int i;
|
|
|
|
|
2016-02-19 13:08:46 +03:00
|
|
|
down_write(&curseg->journal_rwsem);
|
2016-02-14 13:50:40 +03:00
|
|
|
for (i = 0; i < sits_in_cursum(journal); i++) {
|
2014-09-04 14:13:01 +04:00
|
|
|
unsigned int segno;
|
|
|
|
bool dirtied;
|
|
|
|
|
2016-02-14 13:50:40 +03:00
|
|
|
segno = le32_to_cpu(segno_in_journal(journal, i));
|
2014-09-04 14:13:01 +04:00
|
|
|
dirtied = __mark_sit_entry_dirty(sbi, segno);
|
|
|
|
|
|
|
|
if (!dirtied)
|
|
|
|
add_sit_entry(segno, &SM_I(sbi)->sit_entry_set);
|
2012-11-02 12:09:16 +04:00
	}
2016-02-14 13:50:40 +03:00
	update_sits_in_cursum(journal, -i);
f2fs: split journal cache from curseg cache
In the curseg cache, f2fs caches two different parts:
- data of the current summary block, i.e. summary entries and footer info.
- journal info, i.e. sparse nat/sit entries or io stat info.
With this approach, 1) it may cause higher lock contention when we access
or update both parts of the cache, since we use the same mutex lock
curseg_mutex to protect the cache. 2) the current summary block with the last
journal info is written back to the device as a normal summary block
when flushing; however, we treat journal info as valid only in the current
summary, so most normal summary blocks contain junk journal data, which wastes
the remaining space of the summary block.
So, in order to fix the above issues, we split the curseg cache into two parts:
a) current summary block, protected by the original mutex lock curseg_mutex
b) journal cache, protected by the newly introduced r/w semaphore journal_rwsem
When loading the curseg cache during ->mount, we store summary info and
journal info in different caches; when doing a checkpoint, we combine
the data of the two caches into the current summary block for persisting.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-02-19 13:08:46 +03:00
	up_write(&curseg->journal_rwsem);
2012-11-02 12:09:16 +04:00
}

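The flushing policy described in the commit message above (entry sets ordered by ascending entry count, sparse sets journaled first, denser sets written to their SIT blocks) can be sketched as a small standalone C fragment. This is a simplified illustration, not the kernel code: every toy_ name and each size constant is hypothetical, qsort stands in for the kernel's sorted linked list, and the input is assumed to hold each dirty segment number at most once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy sizes; the real values come from the on-disk format. */
#define TOY_ENTRIES_PER_BLOCK 4   /* SIT entries covered by one SIT block */
#define TOY_JOURNAL_SLOTS     3   /* entries the journal can hold */
#define TOY_MAX_SETS          16  /* capacity of the caller's set array */

struct toy_entry_set {
	unsigned int start_segno; /* first segment covered by the SIT block */
	unsigned int entry_cnt;   /* dirty entries that fall in that block */
	int to_journal;           /* 1: journaled, 0: written to the SIT block */
};

/* Order sets by ascending entry count, ties broken by start_segno. */
static int toy_cmp_sets(const void *a, const void *b)
{
	const struct toy_entry_set *x = a, *y = b;

	if (x->entry_cnt != y->entry_cnt)
		return x->entry_cnt < y->entry_cnt ? -1 : 1;
	if (x->start_segno != y->start_segno)
		return x->start_segno < y->start_segno ? -1 : 1;
	return 0;
}

/*
 * Group dirty segment numbers into per-block sets, sort the sets, and
 * decide which sets go to the journal. Returns the number of sets, or 0
 * if more than TOY_MAX_SETS sets would be needed.
 */
size_t toy_plan_flush(const unsigned int *dirty, size_t n,
		      struct toy_entry_set *sets)
{
	size_t nsets = 0, i, j;
	unsigned int free_slots = TOY_JOURNAL_SLOTS;
	int to_journal = 1;

	assert(sets != NULL);

	for (i = 0; i < n; i++) {
		unsigned int start = dirty[i] - dirty[i] % TOY_ENTRIES_PER_BLOCK;

		/* Find the set for this SIT block, or open a new one. */
		for (j = 0; j < nsets; j++)
			if (sets[j].start_segno == start)
				break;
		if (j == nsets) {
			if (nsets == TOY_MAX_SETS)
				return 0;
			sets[nsets].start_segno = start;
			sets[nsets].entry_cnt = 0;
			sets[nsets].to_journal = 0;
			nsets++;
		}
		sets[j].entry_cnt++;
	}

	qsort(sets, nsets, sizeof(*sets), toy_cmp_sets);

	for (i = 0; i < nsets; i++) {
		/* Once one set does not fit, all later (denser) sets go to disk. */
		if (to_journal && sets[i].entry_cnt > free_slots)
			to_journal = 0;
		sets[i].to_journal = to_journal;
		if (to_journal)
			free_slots -= sets[i].entry_cnt;
	}
	return nsets;
}
```

With dirty segments {0, 1, 2, 5, 9, 10}, the sets starting at segments 4 and 8 (one and two entries) fit into the three journal slots, while the set starting at segment 0 (three entries) is left for its SIT block.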
2012-11-29 08:28:09 +04:00
/*
2012-11-02 12:09:16 +04:00
 * CP calls this function, which flushes SIT entries including sit_journal,
 * and moves prefree segs to free segs.
 */
2014-09-21 09:06:39 +04:00
void flush_sit_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc)
2012-11-02 12:09:16 +04:00
{
	struct sit_info *sit_i = SIT_I(sbi);
	unsigned long *bitmap = sit_i->dirty_sentries_bitmap;
	struct curseg_info *curseg = CURSEG_I(sbi, CURSEG_COLD_DATA);
2016-02-19 13:08:46 +03:00
	struct f2fs_journal *journal = curseg->journal;
2014-09-04 14:13:01 +04:00
	struct sit_entry_set *ses, *tmp;
	struct list_head *head = &SM_I(sbi)->sit_entry_set;
	bool to_journal = true;
2014-09-21 09:06:39 +04:00
	struct seg_entry *se;
2012-11-02 12:09:16 +04:00

	mutex_lock(&sit_i->sentry_lock);

2015-02-27 11:52:50 +03:00
	if (!sit_i->dirty_sentries)
		goto out;

2012-11-02 12:09:16 +04:00
	/*
2014-09-04 14:13:01 +04:00
	 * add and account sit entries of dirty bitmap in sit entry
	 * set temporarily
2012-11-02 12:09:16 +04:00
	 */
2014-09-04 14:13:01 +04:00
	add_sits_in_set(sbi);
2012-11-02 12:09:16 +04:00
2014-09-04 14:13:01 +04:00
	/*
	 * if there is not enough space in the journal to store dirty sit
	 * entries, remove all entries from the journal and add and account
	 * them in the sit entry set.
	 */
2016-02-14 13:50:40 +03:00
	if (!__has_cursum_space(journal, sit_i->dirty_sentries, SIT_JOURNAL))
2014-09-04 14:13:01 +04:00
		remove_sits_in_journal(sbi);
2013-11-12 09:49:56 +04:00
2014-09-04 14:13:01 +04:00
	/*
	 * there are two steps to flush sit entries:
	 * #1, flush sit entries to journal in current cold data summary block.
	 * #2, flush sit entries to sit page.
	 */
	list_for_each_entry_safe(ses, tmp, head, set_list) {
2014-10-16 22:43:30 +04:00
		struct page *page = NULL;
2014-09-04 14:13:01 +04:00
		struct f2fs_sit_block *raw_sit = NULL;
		unsigned int start_segno = ses->start_segno;
		unsigned int end = min(start_segno + SIT_ENTRY_PER_BLOCK,
2014-09-23 22:23:01 +04:00
						(unsigned long)MAIN_SEGS(sbi));
f2fs: refactor flush_sit_entries codes for reducing SIT writes
In commit aec71382c681 ("f2fs: refactor flush_nat_entries codes for reducing NAT
writes"), we described the issue as below:
"Although building NAT journal in cursum reduce the read/write work for NAT
block, but previous design leave us lower performance when write checkpoint
frequently for these cases:
1. if journal in cursum has already full, it's a bit of waste that we flush all
nat entries to page for persistence, but not to cache any entries.
2. if journal in cursum is not full, we fill nat entries to journal until
journal is full, then flush the left dirty entries to disk without merge
journaled entries, so these journaled entries may be flushed to disk at next
checkpoint but lost chance to flushed last time."
Actually, we have the same problem in using the SIT journal area.
In this patch, we first update the sit journal with as many dirty entries as
possible. Second, if there is no space in the sit journal, we remove all
entries from the journal and walk through the whole dirty entry bitmap of sit,
accounting dirty sit entries located in the same SIT block to one sit entry
set. All entry sets are linked into the list sit_entry_set in sm_info, sorted
in ascending order by the count of entries in each set. Later we flush the sets
which have the fewest entries into the journal, as many as we can, and then
flush the dense sets with merged entries to disk.
In this way we use the sit journal area more effectively and reduce SIT
updates, which improves performance and saves lifetime of the flash device.
In my testing environment, this patch clearly reduces SIT block updates.
virtual machine + hard disk:
fsstress -p 20 -n 400 -l 5
          sit page num    cp count    sit pages/cp
based     2006.50         1349.75     1.486
patched   1566.25         1463.25     1.070
The latency of the merging operation is small even when handling a great number
of dirty SIT entries in flush_sit_entries:
latency(ns)    dirty sit count
36038          2151
49168          2123
37174          2232
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
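The grouping and ordering described in this commit message can be sketched in user-space C. This is a minimal illustration under stated assumptions, not the kernel implementation: `entry_set`, `build_sets`, `plan_flush`, and the `ENTRIES_PER_BLOCK` value are hypothetical names and numbers chosen for the example, the dirty bitmap is modeled as a plain `bool` array, and the journal is assumed to start empty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Illustrative value (assumption): SIT entries that fit in one SIT block. */
#define ENTRIES_PER_BLOCK 55

/* One set per SIT block that has at least one dirty entry (hypothetical type). */
struct entry_set {
	unsigned int start_segno;	/* first segment number covered by the block */
	unsigned int entry_cnt;		/* number of dirty entries in the block */
	bool to_journal;		/* decided by plan_flush() */
};

/* Order sets by ascending entry count; ties broken by start_segno. */
static int cmp_sets(const void *a, const void *b)
{
	const struct entry_set *x = a, *y = b;

	if (x->entry_cnt != y->entry_cnt)
		return x->entry_cnt < y->entry_cnt ? -1 : 1;
	if (x->start_segno != y->start_segno)
		return x->start_segno < y->start_segno ? -1 : 1;
	return 0;
}

/*
 * Walk the dirty map and account every dirty entry to the set of its
 * SIT block. The caller provides room for one set per SIT block.
 * Returns the number of sets, sorted so the sparsest set comes first.
 */
size_t build_sets(const bool *dirty, unsigned int nsegs, struct entry_set *sets)
{
	size_t nsets = 0;

	for (unsigned int segno = 0; segno < nsegs; segno++) {
		unsigned int start = segno - segno % ENTRIES_PER_BLOCK;

		if (!dirty[segno])
			continue;
		/* Segments are visited in order, so a new block means a new set. */
		if (nsets == 0 || sets[nsets - 1].start_segno != start) {
			sets[nsets].start_segno = start;
			sets[nsets].entry_cnt = 0;
			sets[nsets].to_journal = false;
			nsets++;
		}
		sets[nsets - 1].entry_cnt++;
	}
	qsort(sets, nsets, sizeof(*sets), cmp_sets);
	return nsets;
}

/*
 * Send the sparsest sets to the journal while a whole set still fits;
 * once one set does not fit, every remaining set goes to its SIT block.
 * Returns the number of SIT blocks that must be written.
 */
size_t plan_flush(struct entry_set *sets, size_t nsets, unsigned int journal_free)
{
	bool to_journal = true;
	size_t blocks = 0;

	for (size_t i = 0; i < nsets; i++) {
		if (to_journal && sets[i].entry_cnt > journal_free)
			to_journal = false;
		sets[i].to_journal = to_journal;
		if (to_journal)
			journal_free -= sets[i].entry_cnt;
		else
			blocks++;
	}
	return blocks;
}
```

Because the sets are visited in ascending order of size, the first set that does not fit the journal means no later set fits either, which is why the sketch flips `to_journal` once and never back.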
2014-09-04 14:13:01 +04:00
|
|
|
unsigned int segno = start_segno;
|
|
|
|
|
|
|
|
if (to_journal &&
|
2016-02-14 13:50:40 +03:00
|
|
|
!__has_cursum_space(journal, ses->entry_cnt, SIT_JOURNAL))
|
2014-09-04 14:13:01 +04:00
|
|
|
to_journal = false;
|
|
|
|
|
f2fs: split journal cache from curseg cache
In curseg cache, f2fs caches two different parts:
- data of the current summary block, i.e. summary entries and footer info.
- journal info, i.e. sparse nat/sit entries or io stat info.
With this approach, 1) it may cause higher lock contention when we access
or update both parts of the cache, since the same mutex lock curseg_mutex
protects the whole cache. 2) the current summary block, together with the
last journal info, is written back to the device as a normal summary block
when flushing; however, journal info is treated as valid only in the current
summary block, so most normal summary blocks contain junk journal data, which
wastes the remaining space of the summary block.
So, in order to fix the above issues, we split the curseg cache into two parts:
a) current summary block, protected by the original mutex lock curseg_mutex
b) journal cache, protected by the newly introduced r/w semaphore journal_rwsem
When loading the curseg cache during ->mount, we store summary info and
journal info in different caches; when doing a checkpoint, we combine the
data of the two caches into the current summary block for persisting.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
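The "combine at checkpoint" step from this commit message can be sketched as follows. This is a simplified user-space illustration, not the kernel code: `summary_cache`, `journal_cache`, `build_summary_block`, and the region sizes are assumptions made for the example, and the locking (curseg_mutex and journal_rwsem) is left out so the layout stays in focus. Zero-filling the journal area when no journal is supplied is an illustrative choice for a normal summary block, not a statement about what f2fs does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative sizes (assumptions): entries, then journal, then footer. */
#define BLOCK_SIZE	4096
#define ENTRIES_SIZE	3584
#define FOOTER_SIZE	5
#define JOURNAL_SIZE	(BLOCK_SIZE - ENTRIES_SIZE - FOOTER_SIZE)

/* Cache a): summary entries and footer of the current summary block. */
struct summary_cache {
	uint8_t entries[ENTRIES_SIZE];
	uint8_t footer[FOOTER_SIZE];
};

/* Cache b): journal info, kept separately so it can have its own lock. */
struct journal_cache {
	uint8_t data[JOURNAL_SIZE];
};

/*
 * Assemble one on-disk summary block from the two caches.
 * Passing journal == NULL models a normal summary block, whose journal
 * area is zero-filled here instead of carrying stale journal data.
 */
void build_summary_block(uint8_t *block, const struct summary_cache *sum,
			 const struct journal_cache *journal)
{
	memcpy(block, sum->entries, ENTRIES_SIZE);
	if (journal)
		memcpy(block + ENTRIES_SIZE, journal->data, JOURNAL_SIZE);
	else
		memset(block + ENTRIES_SIZE, 0, JOURNAL_SIZE);
	memcpy(block + ENTRIES_SIZE + JOURNAL_SIZE, sum->footer, FOOTER_SIZE);
}
```

Keeping the two caches as separate structures is what allows them to be guarded by different locks; the block image is only assembled at the moment it has to be persisted.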
2016-02-19 13:08:46 +03:00
|
|
|
if (to_journal) {
|
|
|
|
down_write(&curseg->journal_rwsem);
|
|
|
|
} else {
|
2014-09-04 14:13:01 +04:00
|
|
|
page = get_next_sit_page(sbi, start_segno);
|
|
|
|
raw_sit = page_address(page);
|
2012-11-02 12:09:16 +04:00
|
|
|
}
|
|
|
|
|
2014-09-04 14:13:01 +04:00
|
|
|
/* flush dirty sit entries in region of current sit set */
|
|
|
|
for_each_set_bit_from(segno, bitmap, end) {
|
|
|
|
int offset, sit_offset;
|
2014-09-21 09:06:39 +04:00
|
|
|
|
|
|
|
se = get_seg_entry(sbi, segno);
|
2014-09-04 14:13:01 +04:00
|
|
|
|
|
|
|
/* add discard candidates */
|
2014-12-13 00:53:41 +03:00
|
|
|
if (cpc->reason != CP_DISCARD) {
|
2014-09-21 09:06:39 +04:00
|
|
|
cpc->trim_start = segno;
|
2016-12-30 09:06:15 +03:00
|
|
|
add_discard_addrs(sbi, cpc, false);
|
2014-09-21 09:06:39 +04:00
|
|
|
}
|
2014-09-04 14:13:01 +04:00
|
|
|
|
|
|
|
if (to_journal) {
|
2016-02-14 13:50:40 +03:00
|
|
|
offset = lookup_journal_in_cursum(journal,
|
2014-09-04 14:13:01 +04:00
|
|
|
SIT_JOURNAL, segno, 1);
|
|
|
|
f2fs_bug_on(sbi, offset < 0);
|
2016-02-14 13:50:40 +03:00
|
|
|
segno_in_journal(journal, offset) =
|
2014-09-04 14:13:01 +04:00
|
|
|
cpu_to_le32(segno);
|
|
|
|
seg_info_to_raw_sit(se,
|
2016-02-14 13:50:40 +03:00
|
|
|
&sit_in_journal(journal, offset));
|
2014-09-04 14:13:01 +04:00
|
|
|
} else {
|
|
|
|
sit_offset = SIT_ENTRY_OFFSET(sit_i, segno);
|
|
|
|
seg_info_to_raw_sit(se,
|
|
|
|
&raw_sit->entries[sit_offset]);
|
|
|
|
}
|
2012-11-02 12:09:16 +04:00
|
|
|
|
2014-09-04 14:13:01 +04:00
|
|
|
__clear_bit(segno, bitmap);
|
|
|
|
sit_i->dirty_sentries--;
|
|
|
|
ses->entry_cnt--;
|
2012-11-02 12:09:16 +04:00
|
|
|
}
|
|
|
|
|
2016-02-19 13:08:46 +03:00
|
|
|
if (to_journal)
|
|
|
|
up_write(&curseg->journal_rwsem);
|
|
|
|
else
|
2014-09-04 14:13:01 +04:00
|
|
|
f2fs_put_page(page, 1);
|
|
|
|
|
|
|
|
f2fs_bug_on(sbi, ses->entry_cnt);
|
|
|
|
release_sit_entry_set(ses);
|
2012-11-02 12:09:16 +04:00
|
|
|
}
|
f2fs: refactor flush_sit_entries codes for reducing SIT writes
In commit aec71382c681 ("f2fs: refactor flush_nat_entries codes for reducing NAT
writes"), we described the issue as below:
"Although building the NAT journal in cursum reduces the read/write work for NAT
blocks, the previous design leaves us with lower performance when checkpoints
are written frequently, in these cases:
1. If the journal in cursum is already full, it is a bit of a waste that we
flush all nat entries to pages for persistence without caching any entries.
2. If the journal in cursum is not full, we fill nat entries into the journal
until it is full, then flush the remaining dirty entries to disk without merging
the journaled entries, so these journaled entries may be flushed to disk at the
next checkpoint but lost the chance to be flushed last time."
Actually, we have the same problem when using the SIT journal area.
In this patch, we first update the sit journal with as many dirty entries as
possible. Second, if there is no space in the sit journal, we remove all
entries from the journal and walk through the whole dirty entry bitmap of sit,
accounting dirty sit entries located in the same SIT block to one sit entry set.
All entry sets are linked into the list sit_entry_set in sm_info, sorted in
ascending order by the count of entries in each set. Later we flush the sets
with the fewest entries into the journal, as many as we can, and then flush the
dense sets with merged entries to disk.
In this way we use the sit journal area more effectively and reduce SIT
updates, which improves performance and extends the lifetime of the flash
device.
In my testing environment, this patch clearly reduces SIT block updates.
virtual machine + hard disk:
fsstress -p 20 -n 400 -l 5
          sit page num   cp count   sit pages/cp
based     2006.50        1349.75    1.486
patched   1566.25        1463.25    1.070
Our latency of merging op is small when handling a great number of dirty SIT
entries in flush_sit_entries:
latency(ns)   dirty sit count
36038         2151
49168         2123
37174         2232
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-04 14:13:01 +04:00
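The flush order described in the commit message above can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel implementation: `struct entry_set`, `cmp_entry_cnt` and `plan_flush` are hypothetical names, the journal is reduced to a count of free slots, and the list is replaced by a plain array sorted with `qsort`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a SIT entry set: the dirty entries that share
 * one SIT block. */
struct entry_set {
	unsigned int start_segno;	/* first segment covered by the SIT block */
	unsigned int entry_cnt;		/* dirty entries located in that block */
};

/* Order sets ascending by their dirty entry count. */
static int cmp_entry_cnt(const void *a, const void *b)
{
	const struct entry_set *x = a, *y = b;

	return (x->entry_cnt > y->entry_cnt) - (x->entry_cnt < y->entry_cnt);
}

/*
 * Sort the sets ascending by entry count, put the sparse sets into the
 * journal while they fit, and count the remaining (denser) sets, each of
 * which costs one SIT block write. Returns the number of block writes.
 */
static size_t plan_flush(struct entry_set *sets, size_t nr_sets,
			 unsigned int journal_free)
{
	size_t to_disk = 0;
	size_t i;

	assert(sets != NULL || nr_sets == 0);
	if (nr_sets == 0)
		return 0;

	qsort(sets, nr_sets, sizeof(*sets), cmp_entry_cnt);
	for (i = 0; i < nr_sets; i++) {
		if (sets[i].entry_cnt <= journal_free)
			journal_free -= sets[i].entry_cnt;	/* journaled */
		else
			to_disk++;				/* written as a block */
	}
	return to_disk;
}
```

With sets holding 3, 1, 5 and 2 dirty entries and six free journal slots, the three sparse sets go to the journal and only the set with 5 entries needs a block write. Because the sets are sorted ascending, once one set no longer fits, no later set fits either.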
|
|
|
|
|
|
|
f2fs_bug_on(sbi, !list_empty(head));
|
|
|
|
f2fs_bug_on(sbi, sit_i->dirty_sentries);
|
|
|
|
out:
|
2014-09-21 09:06:39 +04:00
|
|
|
if (cpc->reason == CP_DISCARD) {
|
2016-12-22 06:46:24 +03:00
|
|
|
__u64 trim_start = cpc->trim_start;
|
|
|
|
|
2014-09-21 09:06:39 +04:00
|
|
|
for (; cpc->trim_start <= cpc->trim_end; cpc->trim_start++)
|
2016-12-30 09:06:15 +03:00
|
|
|
add_discard_addrs(sbi, cpc, false);
|
2016-12-22 06:46:24 +03:00
|
|
|
|
|
|
|
cpc->trim_start = trim_start;
|
2014-09-21 09:06:39 +04:00
|
|
|
}
|
2012-11-02 12:09:16 +04:00
|
|
|
mutex_unlock(&sit_i->sentry_lock);
|
|
|
|
|
|
|
|
set_prefree_as_free_segments(sbi);
|
|
|
|
}
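The CP_DISCARD branch above saves `cpc->trim_start`, advances it across the trim range, and then restores it. A minimal sketch of that save/iterate/restore pattern follows. It assumes, without confirmation from this listing, that the helper called in the loop reads the current segment from the shared cursor. `struct trim_ctl` and `walk_trim_range` are hypothetical names.

```c
#include <assert.h>

/* Hypothetical, pared-down control block; only the trim range is modeled. */
struct trim_ctl {
	unsigned long long trim_start;
	unsigned long long trim_end;
};

/*
 * Visit every segment in [trim_start, trim_end] by advancing the shared
 * cursor, then restore the cursor so later users see the original range.
 * Returns the number of segments visited.
 */
static unsigned long long walk_trim_range(struct trim_ctl *cpc)
{
	unsigned long long saved = cpc->trim_start;
	unsigned long long visited = 0;

	assert(cpc->trim_start <= cpc->trim_end);
	for (; cpc->trim_start <= cpc->trim_end; cpc->trim_start++)
		visited++;	/* the real loop would collect discard candidates */

	cpc->trim_start = saved;
	return visited;
}
```

Using the cursor itself as the loop variable avoids passing a separate segment number to the per-segment helper, at the cost of having to restore the field afterwards.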
|
|
|
|
|
|
|
|
static int build_sit_info(struct f2fs_sb_info *sbi)
|
|
|
|
{
|
|
|
|
struct f2fs_super_block *raw_super = F2FS_RAW_SUPER(sbi);
|
|
|
|
struct sit_info *sit_i;
|
|
|
|
unsigned int sit_segs, start;
|
|
|
|
char *src_bitmap, *dst_bitmap;
|
|
|
|
unsigned int bitmap_size;
|
|
|
|
|
|
|
|
/* allocate memory for SIT information */
|
|
|
|
sit_i = kzalloc(sizeof(struct sit_info), GFP_KERNEL);
|
|
|
|
if (!sit_i)
|
|
|
|
return -ENOMEM;
|
|
|
|
|
|
|
|
SM_I(sbi)->sit_info = sit_i;
|
|
|
|
|
2015-09-22 23:50:47 +03:00
|
|
|
sit_i->sentries = f2fs_kvzalloc(MAIN_SEGS(sbi) *
|
|
|
|
sizeof(struct seg_entry), GFP_KERNEL);
|
2012-11-02 12:09:16 +04:00
|
|
|
if (!sit_i->sentries)
|
|
|
|
return -ENOMEM;
|
|
|
|
|
2014-09-23 22:23:01 +04:00
|
|
|
bitmap_size = f2fs_bitmap_size(MAIN_SEGS(sbi));
|
2015-09-22 23:50:47 +03:00
|
|
|
sit_i->dirty_sentries_bitmap = f2fs_kvzalloc(bitmap_size, GFP_KERNEL);
|
2012-11-02 12:09:16 +04:00
|
|
|
if (!sit_i->dirty_sentries_bitmap)
|
|
|
|
return -ENOMEM;
|
|
|
|
|
2014-09-23 22:23:01 +04:00
|
|
|
for (start = 0; start < MAIN_SEGS(sbi); start++) {
|
2012-11-02 12:09:16 +04:00
|
|
|
sit_i->sentries[start].cur_valid_map
|
|
|
|
= kzalloc(SIT_VBLOCK_MAP_SIZE, GFP_KERNEL);
|
|
|
|
sit_i->sentries[start].ckpt_valid_map
|
|
|
|
= kzalloc(SIT_VBLOCK_MAP_SIZE, GFP_KERNEL);
|
2015-05-01 08:37:50 +03:00
|
|
|
if (!sit_i->sentries[start].cur_valid_map ||
|
2016-08-02 20:56:40 +03:00
|
|
|
!sit_i->sentries[start].ckpt_valid_map)
|
2012-11-02 12:09:16 +04:00
|
|
|
return -ENOMEM;
|
2016-08-02 20:56:40 +03:00
|
|
|
|
2017-01-07 13:51:01 +03:00
|
|
|
#ifdef CONFIG_F2FS_CHECK_FS
|
|
|
|
sit_i->sentries[start].cur_valid_map_mir
|
|
|
|
= kzalloc(SIT_VBLOCK_MAP_SIZE, GFP_KERNEL);
|
|
|
|
if (!sit_i->sentries[start].cur_valid_map_mir)
|
|
|
|
return -ENOMEM;
|
|
|
|
#endif
|
|
|
|
|
2016-08-02 20:56:40 +03:00
|
|
|
if (f2fs_discard_en(sbi)) {
|
|
|
|
sit_i->sentries[start].discard_map
|
|
|
|
= kzalloc(SIT_VBLOCK_MAP_SIZE, GFP_KERNEL);
|
|
|
|
if (!sit_i->sentries[start].discard_map)
|
|
|
|
return -ENOMEM;
|
|
|
|
}
|
2012-11-02 12:09:16 +04:00
|
|
|
}
|
|
|
|
|
2015-02-11 03:44:29 +03:00
|
|
|
sit_i->tmp_map = kzalloc(SIT_VBLOCK_MAP_SIZE, GFP_KERNEL);
|
|
|
|
if (!sit_i->tmp_map)
|
|
|
|
return -ENOMEM;
|
|
|
|
|
2012-11-02 12:09:16 +04:00
|
|
|
if (sbi->segs_per_sec > 1) {
|
2015-09-22 23:50:47 +03:00
|
|
|
sit_i->sec_entries = f2fs_kvzalloc(MAIN_SECS(sbi) *
|
|
|
|
sizeof(struct sec_entry), GFP_KERNEL);
|
2012-11-02 12:09:16 +04:00
|
|
|
if (!sit_i->sec_entries)
|
|
|
|
return -ENOMEM;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* get information related to SIT */
|
|
|
|
sit_segs = le32_to_cpu(raw_super->segment_count_sit) >> 1;
|
|
|
|
|
|
|
|
/* setup SIT bitmap from checkpoint pack */
|
|
|
|
bitmap_size = __bitmap_size(sbi, SIT_BITMAP);
|
|
|
|
src_bitmap = __bitmap_ptr(sbi, SIT_BITMAP);
|
|
|
|
|
2013-03-28 04:24:53 +04:00
|
|
|
dst_bitmap = kmemdup(src_bitmap, bitmap_size, GFP_KERNEL);
|
2012-11-02 12:09:16 +04:00
|
|
|
if (!dst_bitmap)
|
|
|
|
return -ENOMEM;
|
|
|
|
|
|
|
|
/* init SIT information */
|
|
|
|
sit_i->s_ops = &default_salloc_ops;
|
|
|
|
|
|
|
|
sit_i->sit_base_addr = le32_to_cpu(raw_super->sit_blkaddr);
|
|
|
|
sit_i->sit_blocks = sit_segs << sbi->log_blocks_per_seg;
|
2016-11-15 05:20:10 +03:00
|
|
|
sit_i->written_valid_blocks = 0;
|
2012-11-02 12:09:16 +04:00
|
|
|
sit_i->sit_bitmap = dst_bitmap;
|
|
|
|
sit_i->bitmap_size = bitmap_size;
|
|
|
|
sit_i->dirty_sentries = 0;
|
|
|
|
sit_i->sents_per_block = SIT_ENTRY_PER_BLOCK;
|
|
|
|
sit_i->elapsed_time = le64_to_cpu(sbi->ckpt->elapsed_time);
|
|
|
|
sit_i->mounted_time = CURRENT_TIME_SEC.tv_sec;
|
|
|
|
mutex_init(&sit_i->sentry_lock);
|
|
|
|
return 0;
|
|
|
|
}
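Two pieces of sizing arithmetic in build_sit_info() can be worked through in isolation. The sketch below assumes the dirty-entry bitmap is rounded up to whole `unsigned long` words, which is the usual convention for kernel bitmaps. It also assumes that the halving of `segment_count_sit` reflects an on-disk area holding two copies of the SIT. `bitmap_bytes` and `sit_blocks` are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_ULONG	(sizeof(unsigned long) * CHAR_BIT)

/* Bytes needed for a bitmap of nr_bits bits, rounded up to whole longs. */
static size_t bitmap_bytes(size_t nr_bits)
{
	return ((nr_bits + BITS_PER_ULONG - 1) / BITS_PER_ULONG) *
		sizeof(unsigned long);
}

/*
 * Blocks in one copy of the SIT area: half of the on-disk SIT segments,
 * each holding 2^log_blocks_per_seg blocks.
 */
static unsigned int sit_blocks(unsigned int segment_count_sit,
			       unsigned int log_blocks_per_seg)
{
	unsigned int sit_segs = segment_count_sit >> 1;

	assert(log_blocks_per_seg < sizeof(unsigned int) * CHAR_BIT);
	return sit_segs << log_blocks_per_seg;
}
```

For example, with 4 SIT segments on disk and 512 blocks per segment (`log_blocks_per_seg` of 9), one copy spans 2 segments, so `sit_blocks(4, 9)` yields 1024 blocks.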
|
|
|
|
|
|
|
|
static int build_free_segmap(struct f2fs_sb_info *sbi)
|
|
|
|
{
|
|
|
|
struct free_segmap_info *free_i;
|
|
|
|
unsigned int bitmap_size, sec_bitmap_size;
|
|
|
|
|
|
|
|
/* allocate memory for free segmap information */
|
|
|
|
free_i = kzalloc(sizeof(struct free_segmap_info), GFP_KERNEL);
|
|
|
|
if (!free_i)
|
|
|
|
return -ENOMEM;
|
|
|
|
|
|
|
|
SM_I(sbi)->free_info = free_i;
|
|
|
|
|
2014-09-23 22:23:01 +04:00
|
|
|
bitmap_size = f2fs_bitmap_size(MAIN_SEGS(sbi));
|
2015-09-22 23:50:47 +03:00
|
|
|
free_i->free_segmap = f2fs_kvmalloc(bitmap_size, GFP_KERNEL);
|
2012-11-02 12:09:16 +04:00
|
|
|
if (!free_i->free_segmap)
|
|
|
|
return -ENOMEM;
|
|
|
|
|
2014-09-23 22:23:01 +04:00
|
|
|
sec_bitmap_size = f2fs_bitmap_size(MAIN_SECS(sbi));
|
2015-09-22 23:50:47 +03:00
|
|
|
free_i->free_secmap = f2fs_kvmalloc(sec_bitmap_size, GFP_KERNEL);
|
2012-11-02 12:09:16 +04:00
|
|
|
if (!free_i->free_secmap)
|
|
|
|
return -ENOMEM;
|
|
|
|
|
|
|
|
/* set all segments as dirty temporarily */
|
|
|
|
memset(free_i->free_segmap, 0xff, bitmap_size);
|
|
|
|
memset(free_i->free_secmap, 0xff, sec_bitmap_size);
|
|
|
|
|
|
|
|
/* init free segmap information */
|
2014-09-23 22:23:01 +04:00
|
|
|
free_i->start_segno = GET_SEGNO_FROM_SEG0(sbi, MAIN_BLKADDR(sbi));
|
2012-11-02 12:09:16 +04:00
|
|
|
free_i->free_segments = 0;
|
|
|
|
free_i->free_sections = 0;
|
2015-02-11 13:20:38 +03:00
|
|
|
spin_lock_init(&free_i->segmap_lock);
|
2012-11-02 12:09:16 +04:00
|
|
|
return 0;
|
|
|
|
}
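build_free_segmap() fills both bitmaps with 0xff and starts the free counters at zero, so every segment begins as "not free" until something marks it otherwise. A user-space sketch of that convention follows. It assumes a set bit means "in use", and `struct free_map`, `free_map_init` and `free_map_set_free` are hypothetical names rather than f2fs functions.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical free-segment map: a set bit means the segment is in use. */
struct free_map {
	unsigned char *bits;
	unsigned int nr_segs;
	unsigned int free_segments;
};

/* Allocate the map with every segment marked in use and no free count. */
static int free_map_init(struct free_map *m, unsigned int nr_segs)
{
	size_t bytes = ((size_t)nr_segs + CHAR_BIT - 1) / CHAR_BIT;

	m->bits = malloc(bytes ? bytes : 1);
	if (!m->bits)
		return -1;
	memset(m->bits, 0xff, bytes);
	m->nr_segs = nr_segs;
	m->free_segments = 0;
	return 0;
}

/* Clear the segment's bit; count it only on the in-use -> free transition. */
static void free_map_set_free(struct free_map *m, unsigned int segno)
{
	unsigned char mask = (unsigned char)(1u << (segno % CHAR_BIT));

	assert(segno < m->nr_segs);
	if (m->bits[segno / CHAR_BIT] & mask) {
		m->bits[segno / CHAR_BIT] &= (unsigned char)~mask;
		m->free_segments++;
	}
}
```

Starting from "all in use" means a later pass only has to clear the bits of segments that turn out to hold no valid blocks, and the counter can never overstate the free space in the meantime.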
|
|
|
|
|
|
|
|
static int build_curseg(struct f2fs_sb_info *sbi)
|
|
|
|
{
|
2012-12-01 05:56:13 +04:00
|
|
|
struct curseg_info *array;
|
2012-11-02 12:09:16 +04:00
|
|
|
int i;
|
|
|
|
|
2014-06-23 20:39:15 +04:00
|
|
|
array = kcalloc(NR_CURSEG_TYPE, sizeof(*array), GFP_KERNEL);
|
2012-11-02 12:09:16 +04:00
|
|
|
if (!array)
|
|
|
|
return -ENOMEM;
|
|
|
|
|
|
|
|
SM_I(sbi)->curseg_array = array;
|
|
|
|
|
|
|
|
for (i = 0; i < NR_CURSEG_TYPE; i++) {
|
|
|
|
mutex_init(&array[i].curseg_mutex);
|
mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.
This promise never materialized. And unlikely will.
We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE. And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.
Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.
Let's stop pretending that pages in page cache are special. They are
not.
The changes are pretty straight-forward:
- <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
- PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
- page_cache_get() -> get_page();
- page_cache_release() -> put_page();
This patch contains automated changes generated with coccinelle using
script below. For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.
The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.
There are few places in the code where coccinelle didn't reach. I'll
fix them manually in a separate patch. Comments and documentation also
will be addressed with the separate patch.
virtual patch
@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E
@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT
@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE
@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK
@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)
@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)
@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-04-01 15:29:47 +03:00
		array[i].sum_blk = kzalloc(PAGE_SIZE, GFP_KERNEL);
		if (!array[i].sum_blk)
			return -ENOMEM;
f2fs: split journal cache from curseg cache
In curseg cache, f2fs caches two different parts:
- datas of current summay block, i.e. summary entries, footer info.
- journal info, i.e. sparse nat/sit entries or io stat info.
With this approach, 1) it may cause higher lock contention when we access
or update both of the parts of cache since we use the same mutex lock
curseg_mutex to protect the cache. 2) current summary block with last
journal info will be writebacked into device as a normal summary block
when flushing, however, we treat journal info as valid one only in current
summary, so most normal summary blocks contain junk journal data, it wastes
remaining space of summary block.
So, in order to fix above issues, we split curseg cache into two parts:
a) current summary block, protected by original mutex lock curseg_mutex
b) journal cache, protected by newly introduced r/w semaphore journal_rwsem
When loading curseg cache during ->mount, we store summary info and
journal info into different caches; When doing checkpoint, we combine
datas of two cache into current summary block for persisting.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-02-19 13:08:46 +03:00
		init_rwsem(&array[i].journal_rwsem);
		array[i].journal = kzalloc(sizeof(struct f2fs_journal),
							GFP_KERNEL);
		if (!array[i].journal)
			return -ENOMEM;
		array[i].segno = NULL_SEGNO;
		array[i].next_blkoff = 0;
	}
	return restore_curseg_summaries(sbi);
}

static void build_sit_entries(struct f2fs_sb_info *sbi)
{
	struct sit_info *sit_i = SIT_I(sbi);
	struct curseg_info *curseg = CURSEG_I(sbi, CURSEG_COLD_DATA);
	struct f2fs_journal *journal = curseg->journal;
	struct seg_entry *se;
	struct f2fs_sit_entry sit;
	int sit_blk_cnt = SIT_BLK_CNT(sbi);
	unsigned int i, start, end;
	unsigned int readed, start_blk = 0;

	do {
		readed = ra_meta_pages(sbi, start_blk, BIO_MAX_PAGES,
							META_SIT, true);

		start = start_blk * sit_i->sents_per_block;
		end = (start_blk + readed) * sit_i->sents_per_block;

		for (; start < end && start < MAIN_SEGS(sbi); start++) {
			struct f2fs_sit_block *sit_blk;
			struct page *page;

			se = &sit_i->sentries[start];
			page = get_current_sit_page(sbi, start);
			sit_blk = (struct f2fs_sit_block *)page_address(page);
			sit = sit_blk->entries[SIT_ENTRY_OFFSET(sit_i, start)];
			f2fs_put_page(page, 1);

			check_block_count(sbi, start, &sit);
			seg_info_from_raw_sit(se, &sit);

			/* build discard map only one time */
			if (f2fs_discard_en(sbi)) {
				memcpy(se->discard_map, se->cur_valid_map,
							SIT_VBLOCK_MAP_SIZE);
				sbi->discard_blks += sbi->blocks_per_seg -
							se->valid_blocks;
			}

			if (sbi->segs_per_sec > 1)
				get_sec_entry(sbi, start)->valid_blocks +=
							se->valid_blocks;
		}
		start_blk += readed;
	} while (start_blk < sit_blk_cnt);

	down_read(&curseg->journal_rwsem);
	for (i = 0; i < sits_in_cursum(journal); i++) {
		unsigned int old_valid_blocks;

		start = le32_to_cpu(segno_in_journal(journal, i));
		se = &sit_i->sentries[start];
		sit = sit_in_journal(journal, i);

		old_valid_blocks = se->valid_blocks;

		check_block_count(sbi, start, &sit);
		seg_info_from_raw_sit(se, &sit);

		if (f2fs_discard_en(sbi)) {
			memcpy(se->discard_map, se->cur_valid_map,
						SIT_VBLOCK_MAP_SIZE);
			sbi->discard_blks += old_valid_blocks -
						se->valid_blocks;
		}

		if (sbi->segs_per_sec > 1)
			get_sec_entry(sbi, start)->valid_blocks +=
				se->valid_blocks - old_valid_blocks;
	}
	up_read(&curseg->journal_rwsem);
}

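build_sit_entries() above fills the in-memory cache in two passes: first from the on-disk SIT blocks, then from the journal, whose entries override what pass one loaded. The second pass adjusts the running totals by the difference between old and new counts instead of recomputing them. The sketch below isolates only that accounting, as a rough illustration. It uses plain arrays in place of SIT pages; `DEMO_BLOCKS_PER_SEG`, `struct demo_journal_entry`, and `demo_build_sit()` are hypothetical names, and it assumes every count is at most `DEMO_BLOCKS_PER_SEG`.

```c
#include <stddef.h>

#define DEMO_BLOCKS_PER_SEG 512u	/* assumed segment size in blocks */

/* Hypothetical journal record: a newer valid-block count for one segment. */
struct demo_journal_entry {
	unsigned int segno;
	unsigned int valid_blocks;
};

/*
 * Pass 1 copies the on-disk counts into the cache and counts the blocks
 * that hold no valid data. Pass 2 lets journal entries override the cache
 * and corrects the total by the delta. Returns the final total.
 */
unsigned int demo_build_sit(unsigned int *cache, const unsigned int *on_disk,
			    size_t nsegs,
			    const struct demo_journal_entry *journal,
			    size_t njournal)
{
	unsigned int discard_blks = 0;
	size_t i;

	for (i = 0; i < nsegs; i++) {
		cache[i] = on_disk[i];
		discard_blks += DEMO_BLOCKS_PER_SEG - cache[i];
	}

	for (i = 0; i < njournal; i++) {
		unsigned int segno = journal[i].segno;
		unsigned int old_valid_blocks;

		if (segno >= nsegs)
			continue;	/* this sketch skips out-of-range entries */

		old_valid_blocks = cache[segno];
		cache[segno] = journal[i].valid_blocks;

		/*
		 * The delta may be "negative"; unsigned arithmetic wraps
		 * modulo 2^N, so the sum still comes out right.
		 */
		discard_blks += old_valid_blocks - cache[segno];
	}
	return discard_blks;
}
```

With on-disk counts `{512, 100, 0}` and journal entries `{1, 300}` and `{2, 10}`, pass one yields 924 blocks without valid data and pass two lowers that to 714, which matches a fresh count over the final cache `{512, 300, 10}`.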
static void init_free_segmap(struct f2fs_sb_info *sbi)
{
	unsigned int start;
	int type;

	for (start = 0; start < MAIN_SEGS(sbi); start++) {
		struct seg_entry *sentry = get_seg_entry(sbi, start);
		if (!sentry->valid_blocks)
			__set_free(sbi, start);
		else
			SIT_I(sbi)->written_valid_blocks +=
						sentry->valid_blocks;
	}

	/* mark the current segments as in use */
	for (type = CURSEG_HOT_DATA; type <= CURSEG_COLD_NODE; type++) {
		struct curseg_info *curseg_t = CURSEG_I(sbi, type);
		__set_test_and_inuse(sbi, curseg_t->segno);
	}
}

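init_free_segmap() above does two things in order: it marks every segment without valid blocks as free while summing the valid blocks of the rest, and it then forces the segments backing the active logs to "in use" even if they are still empty. A minimal sketch of that ordering follows. It assumes a plain `bool` array in place of the kernel bitmap, and `demo_init_free_segmap()` with its parameters is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Sets is_free[i] for segments with no valid blocks and returns the total
 * number of valid blocks found in all other segments. Segments listed in
 * active[] are then marked in use regardless of their block count.
 */
unsigned long demo_init_free_segmap(bool *is_free,
				    const unsigned int *valid_blocks,
				    size_t nsegs,
				    const size_t *active, size_t nactive)
{
	unsigned long written_valid_blocks = 0;
	size_t i;

	for (i = 0; i < nsegs; i++) {
		is_free[i] = (valid_blocks[i] == 0);
		if (!is_free[i])
			written_valid_blocks += valid_blocks[i];
	}

	/* An empty segment that an active log is writing to is not free. */
	for (i = 0; i < nactive; i++) {
		if (active[i] < nsegs)
			is_free[active[i]] = false;
	}
	return written_valid_blocks;
}
```

The second loop has to run after the first. An active log that has just been opened has no valid blocks yet, so the first loop alone would hand its segment out again as free.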
static void init_dirty_segmap(struct f2fs_sb_info *sbi)
{
	struct dirty_seglist_info *dirty_i = DIRTY_I(sbi);
	struct free_segmap_info *free_i = FREE_I(sbi);
	unsigned int segno = 0, offset = 0;
	unsigned short valid_blocks;

	while (1) {
		/* find dirty segment based on free segmap */
		segno = find_next_inuse(free_i, MAIN_SEGS(sbi), offset);
		if (segno >= MAIN_SEGS(sbi))
			break;
		offset = segno + 1;
		valid_blocks = get_valid_blocks(sbi, segno, 0);
		if (valid_blocks == sbi->blocks_per_seg || !valid_blocks)
			continue;
		if (valid_blocks > sbi->blocks_per_seg) {
			f2fs_bug_on(sbi, 1);
			continue;
		}
		mutex_lock(&dirty_i->seglist_lock);
		__locate_dirty_segment(sbi, segno, DIRTY);
		mutex_unlock(&dirty_i->seglist_lock);
	}
}

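The loop in init_dirty_segmap() above applies a simple rule: among the in-use segments, only those that are partly filled count as dirty. Empty and completely full segments are skipped, and a count above the segment size is treated as corruption. The rule can be sketched on its own as follows, assuming plain arrays in place of the free-segment bitmap and SIT cache. `demo_collect_dirty()` and its parameters are hypothetical names, and the sketch skips corrupt entries silently where the kernel code raises `f2fs_bug_on()`.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Writes the numbers of all dirty segments to out[] (which must have room
 * for nsegs entries) and returns how many were found. A segment is dirty
 * when it is in use and 0 < valid_blocks < blocks_per_seg.
 */
size_t demo_collect_dirty(const bool *in_use,
			  const unsigned int *valid_blocks,
			  size_t nsegs, unsigned int blocks_per_seg,
			  size_t *out)
{
	size_t ndirty = 0;
	size_t segno;

	for (segno = 0; segno < nsegs; segno++) {
		unsigned int v = valid_blocks[segno];

		if (!in_use[segno])
			continue;	/* free segments are never dirty */
		if (v == 0 || v == blocks_per_seg)
			continue;	/* empty or full: nothing to clean */
		if (v > blocks_per_seg)
			continue;	/* inconsistent count: skipped here */

		out[ndirty++] = segno;
	}
	return ndirty;
}
```

For example, with `blocks_per_seg` of 512, in-use flags `{1, 1, 0, 1, 1}` and counts `{0, 512, 30, 30, 600}`, only segment 3 is reported: segment 0 is empty, segment 1 is full, segment 2 is free, and segment 4 has an impossible count.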
static int init_victim_secmap(struct f2fs_sb_info *sbi)
{
	struct dirty_seglist_info *dirty_i = DIRTY_I(sbi);
	unsigned int bitmap_size = f2fs_bitmap_size(MAIN_SECS(sbi));

	dirty_i->victim_secmap = f2fs_kvzalloc(bitmap_size, GFP_KERNEL);
	if (!dirty_i->victim_secmap)
		return -ENOMEM;
	return 0;
}

static int build_dirty_segmap(struct f2fs_sb_info *sbi)
|
|
|
|
{
|
|
|
|
struct dirty_seglist_info *dirty_i;
|
|
|
|
unsigned int bitmap_size, i;
|
|
|
|
|
|
|
|
/* allocate memory for dirty segments list information */
|
|
|
|
dirty_i = kzalloc(sizeof(struct dirty_seglist_info), GFP_KERNEL);
|
|
|
|
if (!dirty_i)
|
|
|
|
return -ENOMEM;
|
|
|
|
|
|
|
|
SM_I(sbi)->dirty_info = dirty_i;
|
|
|
|
mutex_init(&dirty_i->seglist_lock);
|
|
|
|
|
2014-09-23 22:23:01 +04:00
|
|
|
bitmap_size = f2fs_bitmap_size(MAIN_SEGS(sbi));
|
2012-11-02 12:09:16 +04:00
|
|
|
|
|
|
|
for (i = 0; i < NR_DIRTY_TYPE; i++) {
|
2015-09-22 23:50:47 +03:00
|
|
|
dirty_i->dirty_segmap[i] = f2fs_kvzalloc(bitmap_size, GFP_KERNEL);
|
2012-11-02 12:09:16 +04:00
|
|
|
if (!dirty_i->dirty_segmap[i])
|
|
|
|
return -ENOMEM;
|
|
|
|
}
|
|
|
|
|
|
|
|
init_dirty_segmap(sbi);
|
2013-03-31 08:26:03 +04:00
|
|
|
return init_victim_secmap(sbi);
|
2012-11-02 12:09:16 +04:00
|
|
|
}
|
|
|
|
|
2012-11-29 08:28:09 +04:00
|
|
|
/*
|
2012-11-02 12:09:16 +04:00
|
|
|
* Update min, max modified time for cost-benefit GC algorithm
|
|
|
|
*/
|
|
|
|
static void init_min_max_mtime(struct f2fs_sb_info *sbi)
|
|
|
|
{
|
|
|
|
struct sit_info *sit_i = SIT_I(sbi);
|
|
|
|
unsigned int segno;
|
|
|
|
|
|
|
|
mutex_lock(&sit_i->sentry_lock);
|
|
|
|
|
|
|
|
sit_i->min_mtime = LLONG_MAX;
|
|
|
|
|
2014-09-23 22:23:01 +04:00
|
|
|
for (segno = 0; segno < MAIN_SEGS(sbi); segno += sbi->segs_per_sec) {
|
2012-11-02 12:09:16 +04:00
|
|
|
unsigned int i;
|
|
|
|
unsigned long long mtime = 0;
|
|
|
|
|
|
|
|
for (i = 0; i < sbi->segs_per_sec; i++)
|
|
|
|
mtime += get_seg_entry(sbi, segno + i)->mtime;
|
|
|
|
|
|
|
|
mtime = div_u64(mtime, sbi->segs_per_sec);
|
|
|
|
|
|
|
|
if (sit_i->min_mtime > mtime)
|
|
|
|
sit_i->min_mtime = mtime;
|
|
|
|
}
|
|
|
|
sit_i->max_mtime = get_mtime(sbi);
|
|
|
|
mutex_unlock(&sit_i->sentry_lock);
|
|
|
|
}
|
|
|
|
|
|
|
|
int build_segment_manager(struct f2fs_sb_info *sbi)
|
|
|
|
{
|
|
|
|
struct f2fs_super_block *raw_super = F2FS_RAW_SUPER(sbi);
|
|
|
|
struct f2fs_checkpoint *ckpt = F2FS_CKPT(sbi);
|
2012-12-01 05:56:13 +04:00
|
|
|
struct f2fs_sm_info *sm_info;
|
2012-11-02 12:09:16 +04:00
|
|
|
int err;
|
|
|
|
|
|
|
|
sm_info = kzalloc(sizeof(struct f2fs_sm_info), GFP_KERNEL);
|
|
|
|
if (!sm_info)
|
|
|
|
return -ENOMEM;
|
|
|
|
|
|
|
|
/* init sm info */
|
|
|
|
sbi->sm_info = sm_info;
|
|
|
|
sm_info->seg0_blkaddr = le32_to_cpu(raw_super->segment0_blkaddr);
|
|
|
|
sm_info->main_blkaddr = le32_to_cpu(raw_super->main_blkaddr);
|
|
|
|
sm_info->segment_count = le32_to_cpu(raw_super->segment_count);
|
|
|
|
sm_info->reserved_segments = le32_to_cpu(ckpt->rsvd_segment_count);
|
|
|
|
sm_info->ovp_segments = le32_to_cpu(ckpt->overprov_segment_count);
|
|
|
|
sm_info->main_segments = le32_to_cpu(raw_super->segment_count_main);
|
|
|
|
sm_info->ssa_blkaddr = le32_to_cpu(raw_super->ssa_blkaddr);
|
2014-03-19 09:17:21 +04:00
|
|
|
sm_info->rec_prefree_segments = sm_info->main_segments *
|
|
|
|
DEF_RECLAIM_PREFREE_SEGMENTS / 100;
|
2016-07-14 04:23:35 +03:00
|
|
|
if (sm_info->rec_prefree_segments > DEF_MAX_RECLAIM_PREFREE_SEGMENTS)
|
|
|
|
sm_info->rec_prefree_segments = DEF_MAX_RECLAIM_PREFREE_SEGMENTS;
|
|
|
|
|
2016-06-13 19:47:48 +03:00
|
|
|
if (!test_opt(sbi, LFS))
|
|
|
|
sm_info->ipu_policy = 1 << F2FS_IPU_FSYNC;
|
2013-11-07 08:13:42 +04:00
|
|
|
sm_info->min_ipu_util = DEF_MIN_IPU_UTIL;
|
2014-09-11 03:53:02 +04:00
|
|
|
sm_info->min_fsync_blocks = DEF_MIN_FSYNC_BLOCKS;
|
2012-11-02 12:09:16 +04:00
|
|
|
|
2013-11-15 08:55:58 +04:00
|
|
|
INIT_LIST_HEAD(&sm_info->discard_list);
|
2016-08-29 18:58:34 +03:00
|
|
|
INIT_LIST_HEAD(&sm_info->wait_list);
|
2013-11-15 08:55:58 +04:00
|
|
|
sm_info->nr_discards = 0;
|
|
|
|
sm_info->max_discards = 0;
|
|
|
|
|
2015-01-27 04:41:23 +03:00
|
|
|
sm_info->trim_sections = DEF_BATCHED_TRIM_SECTIONS;
|
|
|
|
|
f2fs: refactor flush_sit_entries codes for reducing SIT writes
In commit aec71382c681 ("f2fs: refactor flush_nat_entries codes for reducing NAT
writes"), we described the issue as below:
"Although building the NAT journal in cursum reduces the read/write work for NAT
blocks, the previous design gives us lower performance when writing checkpoints
frequently in these cases:
1. if the journal in cursum is already full, it is a bit of a waste that we flush
all nat entries to page for persistence, but do not cache any entries.
2. if the journal in cursum is not full, we fill nat entries into the journal until
the journal is full, then flush the remaining dirty entries to disk without merging
the journaled entries, so these journaled entries may be flushed to disk at the next
checkpoint but lost the chance to be flushed last time."
Actually, we have the same problem in using the SIT journal area.
In this patch, we first update the sit journal with as many dirty entries as
possible. Secondly, if there is no space in the sit journal, we remove all
entries from the journal and walk through the whole dirty entry bitmap of sit,
accounting dirty sit entries located in the same SIT block to one sit entry set.
All entry sets are linked to the list sit_entry_set in sm_info, sorted in ascending
order by the count of entries in each set. Later we flush the entries of the sets
which have the fewest entries into the journal, as many as we can, and then flush
the dense sets with merged entries to disk.
In this way we can use the sit journal area more effectively, and we also reduce
SIT updates, resulting in better performance and a longer lifetime of the flash
device.
In my testing environment, this patch clearly helps to reduce SIT block
updates.
virtual machine + hard disk:
fsstress -p 20 -n 400 -l 5
            sit page num    cp count    sit pages/cp
based       2006.50         1349.75     1.486
patched     1566.25         1463.25     1.070
The latency of the merging operation is small when handling a great number of
dirty SIT entries in flush_sit_entries:
latency(ns)    dirty sit count
36038          2151
49168          2123
37174          2232
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2014-09-04 14:13:01 +04:00
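The set-based flush described in the commit message above can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel implementation: `plan_sit_flush`, `struct entry_set`, `ENTRIES_PER_BLOCK`, `JOURNAL_SLOTS`, and `MAX_SETS` are hypothetical names and values chosen for the example, and the input is assumed to be a list of distinct dirty segment numbers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical values for illustration; not the on-disk f2fs constants. */
#define ENTRIES_PER_BLOCK 4	/* SIT entries covered by one SIT block */
#define JOURNAL_SLOTS     3	/* free slots assumed in the SIT journal */
#define MAX_SETS          64	/* capacity of the scratch set array */

struct entry_set {
	unsigned int block;	/* index of the SIT block this set covers */
	unsigned int count;	/* number of dirty entries in that block */
};

static int cmp_count(const void *a, const void *b)
{
	const struct entry_set *x = a, *y = b;

	return (x->count > y->count) - (x->count < y->count);
}

/*
 * Group dirty segment numbers by SIT block, sort the sets by ascending
 * entry count, journal the sparse sets while they fit, and count one
 * block write for every remaining (denser) set.
 * Returns the number of SIT blocks to write, or -1 if MAX_SETS is exceeded.
 */
static int plan_sit_flush(const unsigned int *dirty, size_t n,
			  unsigned int *journaled)
{
	struct entry_set sets[MAX_SETS];
	size_t nsets = 0, i, j;
	unsigned int free_slots = JOURNAL_SLOTS;
	int block_writes = 0;
	int to_journal = 1;

	assert(journaled != NULL);
	*journaled = 0;

	for (i = 0; i < n; i++) {
		unsigned int blk = dirty[i] / ENTRIES_PER_BLOCK;

		for (j = 0; j < nsets; j++)
			if (sets[j].block == blk)
				break;
		if (j == nsets) {
			if (nsets == MAX_SETS)
				return -1;
			sets[nsets].block = blk;
			sets[nsets].count = 0;
			nsets++;
		}
		sets[j].count++;
	}

	qsort(sets, nsets, sizeof(sets[0]), cmp_count);

	for (i = 0; i < nsets; i++) {
		/* once a set does not fit, all later sets go to disk */
		if (to_journal && sets[i].count > free_slots)
			to_journal = 0;
		if (to_journal) {
			free_slots -= sets[i].count;
			*journaled += sets[i].count;
		} else {
			block_writes++;
		}
	}
	return block_writes;
}
```

With dirty segments 0, 1, 2, 3, 5, 9 and 10, the sets hold 4, 1 and 2 entries. The two sparse sets fill the three journal slots, so only the dense set costs a SIT block write.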
|
|
|
INIT_LIST_HEAD(&sm_info->sit_entry_set);
|
|
|
|
|
2014-04-11 13:49:55 +04:00
|
|
|
if (test_opt(sbi, FLUSH_MERGE) && !f2fs_readonly(sbi->sb)) {
|
2014-04-27 10:21:33 +04:00
|
|
|
err = create_flush_cmd_control(sbi);
|
|
|
|
if (err)
|
2014-04-27 10:21:21 +04:00
|
|
|
return err;
|
2014-04-02 10:34:36 +04:00
|
|
|
}
|
|
|
|
|
2012-11-02 12:09:16 +04:00
|
|
|
err = build_sit_info(sbi);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
err = build_free_segmap(sbi);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
err = build_curseg(sbi);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
|
|
|
/* reinit free segmap based on SIT */
|
|
|
|
build_sit_entries(sbi);
|
|
|
|
|
|
|
|
init_free_segmap(sbi);
|
|
|
|
err = build_dirty_segmap(sbi);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
|
|
|
init_min_max_mtime(sbi);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void discard_dirty_segmap(struct f2fs_sb_info *sbi,
|
|
|
|
enum dirty_type dirty_type)
|
|
|
|
{
|
|
|
|
struct dirty_seglist_info *dirty_i = DIRTY_I(sbi);
|
|
|
|
|
|
|
|
mutex_lock(&dirty_i->seglist_lock);
|
2015-09-22 23:50:47 +03:00
|
|
|
kvfree(dirty_i->dirty_segmap[dirty_type]);
|
2012-11-02 12:09:16 +04:00
|
|
|
dirty_i->nr_dirty[dirty_type] = 0;
|
|
|
|
mutex_unlock(&dirty_i->seglist_lock);
|
|
|
|
}
|
|
|
|
|
2013-03-31 08:26:03 +04:00
|
|
|
static void destroy_victim_secmap(struct f2fs_sb_info *sbi)
|
2012-11-02 12:09:16 +04:00
|
|
|
{
|
|
|
|
struct dirty_seglist_info *dirty_i = DIRTY_I(sbi);
|
2015-09-22 23:50:47 +03:00
|
|
|
kvfree(dirty_i->victim_secmap);
|
2012-11-02 12:09:16 +04:00
|
|
|
}
|
|
|
|
|
|
|
|
static void destroy_dirty_segmap(struct f2fs_sb_info *sbi)
|
|
|
|
{
|
|
|
|
struct dirty_seglist_info *dirty_i = DIRTY_I(sbi);
|
|
|
|
int i;
|
|
|
|
|
|
|
|
if (!dirty_i)
|
|
|
|
return;
|
|
|
|
|
|
|
|
/* discard pre-free/dirty segments list */
|
|
|
|
for (i = 0; i < NR_DIRTY_TYPE; i++)
|
|
|
|
discard_dirty_segmap(sbi, i);
|
|
|
|
|
2013-03-31 08:26:03 +04:00
|
|
|
destroy_victim_secmap(sbi);
|
2012-11-02 12:09:16 +04:00
|
|
|
SM_I(sbi)->dirty_info = NULL;
|
|
|
|
kfree(dirty_i);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void destroy_curseg(struct f2fs_sb_info *sbi)
|
|
|
|
{
|
|
|
|
struct curseg_info *array = SM_I(sbi)->curseg_array;
|
|
|
|
int i;
|
|
|
|
|
|
|
|
if (!array)
|
|
|
|
return;
|
|
|
|
SM_I(sbi)->curseg_array = NULL;
|
f2fs: split journal cache from curseg cache
In the curseg cache, f2fs caches two different parts:
- data of the current summary block, i.e. summary entries and footer info.
- journal info, i.e. sparse nat/sit entries or io stat info.
With this approach, 1) it may cause higher lock contention when we access
or update both parts of the cache, since we use the same mutex lock
curseg_mutex to protect the cache. 2) the current summary block with the last
journal info will be written back to the device as a normal summary block
when flushing; however, we treat journal info as valid only in the current
summary, so most normal summary blocks contain junk journal data, which wastes
the remaining space of the summary block.
So, in order to fix the above issues, we split the curseg cache into two parts:
a) current summary block, protected by the original mutex lock curseg_mutex
b) journal cache, protected by the newly introduced r/w semaphore journal_rwsem
When loading the curseg cache during ->mount, we store summary info and
journal info in different caches; when doing a checkpoint, we combine the
data of the two caches into the current summary block for persisting.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-02-19 13:08:46 +03:00
|
|
|
for (i = 0; i < NR_CURSEG_TYPE; i++) {
|
2012-11-02 12:09:16 +04:00
|
|
|
kfree(array[i].sum_blk);
|
2016-02-19 13:08:46 +03:00
|
|
|
kfree(array[i].journal);
|
|
|
|
}
|
2012-11-02 12:09:16 +04:00
|
|
|
kfree(array);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void destroy_free_segmap(struct f2fs_sb_info *sbi)
|
|
|
|
{
|
|
|
|
struct free_segmap_info *free_i = SM_I(sbi)->free_info;
|
|
|
|
if (!free_i)
|
|
|
|
return;
|
|
|
|
SM_I(sbi)->free_info = NULL;
|
2015-09-22 23:50:47 +03:00
|
|
|
kvfree(free_i->free_segmap);
|
|
|
|
kvfree(free_i->free_secmap);
|
2012-11-02 12:09:16 +04:00
|
|
|
kfree(free_i);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void destroy_sit_info(struct f2fs_sb_info *sbi)
{
	struct sit_info *sit_i = SIT_I(sbi);
	unsigned int start;

	if (!sit_i)
		return;

	if (sit_i->sentries) {
		for (start = 0; start < MAIN_SEGS(sbi); start++) {
			kfree(sit_i->sentries[start].cur_valid_map);
#ifdef CONFIG_F2FS_CHECK_FS
			kfree(sit_i->sentries[start].cur_valid_map_mir);
#endif
			kfree(sit_i->sentries[start].ckpt_valid_map);
			kfree(sit_i->sentries[start].discard_map);
		}
	}
	kfree(sit_i->tmp_map);

	kvfree(sit_i->sentries);
	kvfree(sit_i->sec_entries);
	kvfree(sit_i->dirty_sentries_bitmap);

	SM_I(sbi)->sit_info = NULL;
	kfree(sit_i->sit_bitmap);
	kfree(sit_i);
}

void destroy_segment_manager(struct f2fs_sb_info *sbi)
{
	struct f2fs_sm_info *sm_info = SM_I(sbi);

	if (!sm_info)
		return;
	destroy_flush_cmd_control(sbi, true);
	destroy_dirty_segmap(sbi);
	destroy_curseg(sbi);
	destroy_free_segmap(sbi);
	destroy_sit_info(sbi);
	sbi->sm_info = NULL;
	kfree(sm_info);
}

int __init create_segment_manager_caches(void)
{
	discard_entry_slab = f2fs_kmem_cache_create("discard_entry",
			sizeof(struct discard_entry));
	if (!discard_entry_slab)
		goto fail;

	bio_entry_slab = f2fs_kmem_cache_create("bio_entry",
			sizeof(struct bio_entry));
	if (!bio_entry_slab)
		goto destroy_discard_entry;

	sit_entry_set_slab = f2fs_kmem_cache_create("sit_entry_set",
			sizeof(struct sit_entry_set));
	if (!sit_entry_set_slab)
		goto destroy_bio_entry;

	inmem_entry_slab = f2fs_kmem_cache_create("inmem_page_entry",
			sizeof(struct inmem_pages));
	if (!inmem_entry_slab)
		goto destroy_sit_entry_set;
	return 0;

destroy_sit_entry_set:
	kmem_cache_destroy(sit_entry_set_slab);
destroy_bio_entry:
	kmem_cache_destroy(bio_entry_slab);
destroy_discard_entry:
	kmem_cache_destroy(discard_entry_slab);
fail:
	return -ENOMEM;
}

void destroy_segment_manager_caches(void)
{
	kmem_cache_destroy(sit_entry_set_slab);
	kmem_cache_destroy(bio_entry_slab);
	kmem_cache_destroy(discard_entry_slab);
	kmem_cache_destroy(inmem_entry_slab);
}