Merge tag 'dm-3.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper changes from Mike Snitzer:
 "A set of device-mapper changes for 3.13.

  Improve reliability of buffer allocations for dm messages with a
  small number of arguments, a couple of path group initialization
  fixes for dm multipath, a fix for resizing a dm array, various fixes
  and optimizations for dm cache, and a fix for device mapper's Kconfig
  menu indentation.

  Features added include:

   - dm crypt support for activating legacy CBC TrueCrypt containers
     (useful for forensics of these old TCRYPT containers)

   - reduced dm-cache memory requirements for each block in the cache

   - basic support for shrinking a dm-cache's cache (fast) device

   - most notably, dm-cache support for managing cache coherency when
     deploying dm-cache with sophisticated origin volumes (that support
     hardware snapshots and/or clustering): these changes come in the
     form of a new passthrough operation mode and a cache block
     invalidation interface"

* tag 'dm-3.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (32 commits)
  dm cache: resolve small nits and improve Documentation
  dm cache: add cache block invalidation support
  dm cache: add remove_cblock method to policy interface
  dm cache policy mq: reduce memory requirements
  dm cache metadata: check the metadata version when reading the superblock
  dm cache: add passthrough mode
  dm cache: cache shrinking support
  dm cache: promotion optimisation for writes
  dm cache: be much more aggressive about promoting writes to discarded blocks
  dm cache policy mq: implement writeback_work() and mq_{set,clear}_dirty()
  dm cache: optimize commit_if_needed
  dm space map disk: optimise sm_disk_dec_block
  MAINTAINERS: add reference to device-mapper's linux-dm.git tree
  dm: fix Kconfig menu indentation
  dm: allow remove to be deferred
  dm table: print error on preresume failure
  dm crypt: add TCW IV mode for old CBC TCRYPT containers
  dm crypt: properly handle extra key string in initialization
  dm cache: log error message if dm_kcopyd_copy() fails
  dm cache: use cell_defer() boolean argument consistently
  ...
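For orientation, activating an existing cache device in the new passthrough mode might look like the following; the device names and the 41943040-sector size are illustrative only, reusing the constructor format documented in Documentation/device-mapper/cache.txt:

    # Cache contents may be stale relative to the origin (e.g. after a
    # hardware snapshot rollback), so start in passthrough mode: reads and
    # writes are served by the origin and write hits invalidate cache blocks.
    dmsetup create my_cache --table '0 41943040 cache /dev/mapper/metadata \
        /dev/mapper/ssd /dev/mapper/origin 512 1 passthrough default 0'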
@@ -30,8 +30,10 @@ multiqueue

 This policy is the default.

-The multiqueue policy has two sets of 16 queues: one set for entries
-waiting for the cache and another one for those in the cache.
+The multiqueue policy has three sets of 16 queues: one set for entries
+waiting for the cache and another two for those in the cache (a set for
+clean entries and a set for dirty entries).

 Cache entries in the queues are aged based on logical time. Entry into
 the cache is based on variable thresholds and queue selection is based
 on hit count on entry. The policy aims to take different cache miss
@@ -68,10 +68,11 @@ So large block sizes are bad because they waste cache space. And small
 block sizes are bad because they increase the amount of metadata (both
 in core and on disk).

-Writeback/writethrough
-----------------------
+Cache operating modes
+---------------------

-The cache has two modes, writeback and writethrough.
+The cache has three operating modes: writeback, writethrough and
+passthrough.

 If writeback, the default, is selected then a write to a block that is
 cached will go only to the cache and the block will be marked dirty in
@@ -81,8 +82,31 @@ If writethrough is selected then a write to a cached block will not
 complete until it has hit both the origin and cache devices. Clean
 blocks should remain clean.

+If passthrough is selected, useful when the cache contents are not known
+to be coherent with the origin device, then all reads are served from
+the origin device (all reads miss the cache) and all writes are
+forwarded to the origin device; additionally, write hits cause cache
+block invalidates. To enable passthrough mode the cache must be clean.
+Passthrough mode allows a cache device to be activated without having to
+worry about coherency. Coherency that exists is maintained, although
+the cache will gradually cool as writes take place. If the coherency of
+the cache can later be verified, or established through use of the
+"invalidate_cblocks" message, the cache device can be transitioned to
+writethrough or writeback mode while still warm. Otherwise, the cache
+contents can be discarded prior to transitioning to the desired
+operating mode.
+
 A simple cleaner policy is provided, which will clean (write back) all
-dirty blocks in a cache. Useful for decommissioning a cache.
+dirty blocks in a cache. Useful for decommissioning a cache or when
+shrinking a cache. Shrinking the cache's fast device requires all cache
+blocks, in the area of the cache being removed, to be clean. If the
+area being removed from the cache still contains dirty blocks the resize
+will fail. Care must be taken to never reduce the volume used for the
+cache's fast device until the cache is clean. This is of particular
+importance if writeback mode is used. Writethrough and passthrough
+modes already maintain a clean cache. Future support to partially clean
+the cache, above a specified threshold, will allow for keeping the cache
+warm and in writeback mode during resize.

 Migration throttling
 --------------------
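As a rough sketch of the decommission/shrink flow described above; the device names, table values and use of a reload are illustrative assumptions, not part of this series:

    # Switch the cache to the cleaner policy so every dirty block is
    # written back to the origin:
    dmsetup reload my_cache --table '0 41943040 cache /dev/mapper/metadata \
        /dev/mapper/ssd /dev/mapper/origin 512 0 cleaner 0'
    dmsetup suspend my_cache && dmsetup resume my_cache
    # Poll 'dmsetup status my_cache' until the dirty count drops to zero;
    # only then reduce the fast device or remove the cache.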
@@ -161,7 +185,7 @@ Constructor
 block size       : cache unit size in sectors

 #feature args    : number of feature arguments passed
-feature args     : writethrough. (The default is writeback.)
+feature args     : writethrough or passthrough (The default is writeback.)

 policy           : the replacement policy to use
 #policy args     : an even number of arguments corresponding to
@@ -177,6 +201,13 @@ Optional feature arguments are:
      back cache block contents later for performance reasons,
      so they may differ from the corresponding origin blocks.

+   passthrough  : a degraded mode useful for various cache coherency
+                  situations (e.g., rolling back snapshots of
+                  underlying storage). Reads and writes always go to
+                  the origin. If a write goes to a cached origin
+                  block, then the cache block is invalidated.
+                  To enable passthrough mode the cache must be clean.
+
 A policy called 'default' is always registered. This is an alias for
 the policy we currently think is giving best all round performance.
@@ -231,12 +262,26 @@ The message format is:
 E.g.
 dmsetup message my_cache 0 sequential_threshold 1024

+Invalidation is removing an entry from the cache without writing it
+back. Cache blocks can be invalidated via the invalidate_cblocks
+message, which takes an arbitrary number of cblock ranges. Each cblock
+must be expressed as a decimal value, in the future a variant message
+that takes cblock ranges expressed in hexidecimal may be needed to
+better support efficient invalidation of larger caches. The cache must
+be in passthrough mode when invalidate_cblocks is used.
+
+   invalidate_cblocks [<cblock>|<cblock begin>-<cblock end>]*
+
+E.g.
+dmsetup message my_cache 0 invalidate_cblocks 2345 3456-4567 5678-6789
+
 Examples
 ========

 The test suite can be found here:

-	https://github.com/jthornber/thinp-test-suite
+	https://github.com/jthornber/device-mapper-test-suite

 dmsetup create my_cache --table '0 41943040 cache /dev/mapper/metadata \
 	/dev/mapper/ssd /dev/mapper/origin 512 1 writeback default 0'
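Putting the messages together, a plausible warm transition out of passthrough mode could look like this (same illustrative device names as the constructor example above):

    # Drop the cache blocks known to be stale, then reload in writeback
    # mode while the remaining cache contents are still warm.
    dmsetup message my_cache 0 invalidate_cblocks 2345 3456-4567
    dmsetup reload my_cache --table '0 41943040 cache /dev/mapper/metadata \
        /dev/mapper/ssd /dev/mapper/origin 512 1 writeback default 0'
    dmsetup suspend my_cache && dmsetup resume my_cache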
@@ -4,12 +4,15 @@ dm-crypt
 Device-Mapper's "crypt" target provides transparent encryption of block devices
 using the kernel crypto API.

+For a more detailed description of supported parameters see:
+http://code.google.com/p/cryptsetup/wiki/DMCrypt
+
 Parameters: <cipher> <key> <iv_offset> <device path> \
 	      <offset> [<#opt_params> <opt_params>]

 <cipher>
    Encryption cipher and an optional IV generation mode.
-   (In format cipher[:keycount]-chainmode-ivopts:ivmode).
+   (In format cipher[:keycount]-chainmode-ivmode[:ivopts]).
    Examples:
       des
       aes-cbc-essiv:sha256
@@ -19,7 +22,11 @@ Parameters: <cipher> <key> <iv_offset> <device path> \

 <key>
    Key used for encryption. It is encoded as a hexadecimal number.
-   You can only use key sizes that are valid for the selected cipher.
+   You can only use key sizes that are valid for the selected cipher
+   in combination with the selected iv mode.
+   Note that for some iv modes the key string can contain additional
+   keys (for example IV seed) so the key contains more parts concatenated
+   into a single string.

 <keycount>
    Multi-key compatibility mode. You can define <keycount> keys and
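For illustration only, a crypt table using the new tcw IV mode might look roughly as follows. The key string layout (data key, then IV seed, then the 16-byte whitening value, concatenated into a single hex string) follows the key_extra_size handling added in this series; the device, length, offsets and key are placeholders:

    dmsetup create tcrypt_vol --table '0 204800 crypt aes-cbc-tcw \
        <datakey_hex><ivseed_hex><whitening_hex> 0 /dev/sdb 256'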
@@ -2647,6 +2647,7 @@ M:	dm-devel@redhat.com
 L:	dm-devel@redhat.com
 W:	http://sources.redhat.com/dm
 Q:	http://patchwork.kernel.org/project/dm-devel/list/
+T:	git git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git
 T:	quilt http://people.redhat.com/agk/patches/linux/editing/
 S:	Maintained
 F:	Documentation/device-mapper/
@@ -297,6 +297,17 @@ config DM_MIRROR
 	 Allow volume managers to mirror logical volumes, also
 	 needed for live data migration tools such as 'pvmove'.

+config DM_LOG_USERSPACE
+	tristate "Mirror userspace logging"
+	depends on DM_MIRROR && NET
+	select CONNECTOR
+	---help---
+	  The userspace logging module provides a mechanism for
+	  relaying the dm-dirty-log API to userspace. Log designs
+	  which are more suited to userspace implementation (e.g.
+	  shared storage logs) or experimental logs can be implemented
+	  by leveraging this framework.
+
 config DM_RAID
 	tristate "RAID 1/4/5/6/10 target"
 	depends on BLK_DEV_DM
@@ -323,17 +334,6 @@ config DM_RAID
 	 RAID-5, RAID-6 distributes the syndromes across the drives
 	 in one of the available parity distribution methods.

-config DM_LOG_USERSPACE
-	tristate "Mirror userspace logging"
-	depends on DM_MIRROR && NET
-	select CONNECTOR
-	---help---
-	  The userspace logging module provides a mechanism for
-	  relaying the dm-dirty-log API to userspace. Log designs
-	  which are more suited to userspace implementation (e.g.
-	  shared storage logs) or experimental logs can be implemented
-	  by leveraging this framework.
-
 config DM_ZERO
 	tristate "Zero target"
 	depends on BLK_DEV_DM
@@ -20,7 +20,13 @@
 #define CACHE_SUPERBLOCK_MAGIC 06142003
 #define CACHE_SUPERBLOCK_LOCATION 0
-#define CACHE_VERSION 1
+
+/*
+ * defines a range of metadata versions that this module can handle.
+ */
+#define MIN_CACHE_VERSION 1
+#define MAX_CACHE_VERSION 1
+
 #define CACHE_METADATA_CACHE_SIZE 64

 /*
@@ -134,6 +140,18 @@ static void sb_prepare_for_write(struct dm_block_validator *v,
 						  SUPERBLOCK_CSUM_XOR));
 }

+static int check_metadata_version(struct cache_disk_superblock *disk_super)
+{
+	uint32_t metadata_version = le32_to_cpu(disk_super->version);
+	if (metadata_version < MIN_CACHE_VERSION || metadata_version > MAX_CACHE_VERSION) {
+		DMERR("Cache metadata version %u found, but only versions between %u and %u supported.",
+		      metadata_version, MIN_CACHE_VERSION, MAX_CACHE_VERSION);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 static int sb_check(struct dm_block_validator *v,
 		    struct dm_block *b,
 		    size_t sb_block_size)
@@ -164,7 +182,7 @@ static int sb_check(struct dm_block_validator *v,
 		return -EILSEQ;
 	}

-	return 0;
+	return check_metadata_version(disk_super);
 }

 static struct dm_block_validator sb_validator = {
@@ -198,7 +216,7 @@ static int superblock_lock(struct dm_cache_metadata *cmd,

 /*----------------------------------------------------------------*/

-static int __superblock_all_zeroes(struct dm_block_manager *bm, int *result)
+static int __superblock_all_zeroes(struct dm_block_manager *bm, bool *result)
 {
 	int r;
 	unsigned i;
@@ -214,10 +232,10 @@ static int __superblock_all_zeroes(struct dm_block_manager *bm, int *result)
 		return r;

 	data_le = dm_block_data(b);
-	*result = 1;
+	*result = true;
 	for (i = 0; i < sb_block_size; i++) {
 		if (data_le[i] != zero) {
-			*result = 0;
+			*result = false;
 			break;
 		}
 	}
@@ -270,7 +288,7 @@ static int __write_initial_superblock(struct dm_cache_metadata *cmd)
 	disk_super->flags = 0;
 	memset(disk_super->uuid, 0, sizeof(disk_super->uuid));
 	disk_super->magic = cpu_to_le64(CACHE_SUPERBLOCK_MAGIC);
-	disk_super->version = cpu_to_le32(CACHE_VERSION);
+	disk_super->version = cpu_to_le32(MAX_CACHE_VERSION);
 	memset(disk_super->policy_name, 0, sizeof(disk_super->policy_name));
 	memset(disk_super->policy_version, 0, sizeof(disk_super->policy_version));
 	disk_super->policy_hint_size = 0;
@@ -411,7 +429,8 @@ bad:
 static int __open_or_format_metadata(struct dm_cache_metadata *cmd,
 				     bool format_device)
 {
-	int r, unformatted;
+	int r;
+	bool unformatted = false;

 	r = __superblock_all_zeroes(cmd->bm, &unformatted);
 	if (r)
@@ -666,19 +685,85 @@ void dm_cache_metadata_close(struct dm_cache_metadata *cmd)
 	kfree(cmd);
 }

+/*
+ * Checks that the given cache block is either unmapped or clean.
+ */
+static int block_unmapped_or_clean(struct dm_cache_metadata *cmd, dm_cblock_t b,
+				   bool *result)
+{
+	int r;
+	__le64 value;
+	dm_oblock_t ob;
+	unsigned flags;
+
+	r = dm_array_get_value(&cmd->info, cmd->root, from_cblock(b), &value);
+	if (r) {
+		DMERR("block_unmapped_or_clean failed");
+		return r;
+	}
+
+	unpack_value(value, &ob, &flags);
+	*result = !((flags & M_VALID) && (flags & M_DIRTY));
+
+	return 0;
+}
+
+static int blocks_are_unmapped_or_clean(struct dm_cache_metadata *cmd,
+					dm_cblock_t begin, dm_cblock_t end,
+					bool *result)
+{
+	int r;
+	*result = true;
+
+	while (begin != end) {
+		r = block_unmapped_or_clean(cmd, begin, result);
+		if (r)
+			return r;
+
+		if (!*result) {
+			DMERR("cache block %llu is dirty",
+			      (unsigned long long) from_cblock(begin));
+			return 0;
+		}
+
+		begin = to_cblock(from_cblock(begin) + 1);
+	}
+
+	return 0;
+}
+
 int dm_cache_resize(struct dm_cache_metadata *cmd, dm_cblock_t new_cache_size)
 {
 	int r;
+	bool clean;
 	__le64 null_mapping = pack_value(0, 0);

 	down_write(&cmd->root_lock);
 	__dm_bless_for_disk(&null_mapping);
+
+	if (from_cblock(new_cache_size) < from_cblock(cmd->cache_blocks)) {
+		r = blocks_are_unmapped_or_clean(cmd, new_cache_size, cmd->cache_blocks, &clean);
+		if (r) {
+			__dm_unbless_for_disk(&null_mapping);
+			goto out;
+		}
+
+		if (!clean) {
+			DMERR("unable to shrink cache due to dirty blocks");
+			r = -EINVAL;
+			__dm_unbless_for_disk(&null_mapping);
+			goto out;
+		}
+	}
+
 	r = dm_array_resize(&cmd->info, cmd->root, from_cblock(cmd->cache_blocks),
 			    from_cblock(new_cache_size),
 			    &null_mapping, &cmd->root);
 	if (!r)
 		cmd->cache_blocks = new_cache_size;
 	cmd->changed = true;
+
+out:
 	up_write(&cmd->root_lock);

 	return r;
@@ -1182,3 +1267,8 @@ int dm_cache_save_hint(struct dm_cache_metadata *cmd, dm_cblock_t cblock,

 	return r;
 }
+
+int dm_cache_metadata_all_clean(struct dm_cache_metadata *cmd, bool *result)
+{
+	return blocks_are_unmapped_or_clean(cmd, 0, cmd->cache_blocks, result);
+}
@@ -137,6 +137,11 @@ int dm_cache_begin_hints(struct dm_cache_metadata *cmd, struct dm_cache_policy *
 int dm_cache_save_hint(struct dm_cache_metadata *cmd,
 		       dm_cblock_t cblock, uint32_t hint);

+/*
+ * Query method. Are all the blocks in the cache clean?
+ */
+int dm_cache_metadata_all_clean(struct dm_cache_metadata *cmd, bool *result);
+
 /*----------------------------------------------------------------*/

 #endif /* DM_CACHE_METADATA_H */
@@ -61,7 +61,12 @@ static inline int policy_writeback_work(struct dm_cache_policy *p,

 static inline void policy_remove_mapping(struct dm_cache_policy *p, dm_oblock_t oblock)
 {
-	return p->remove_mapping(p, oblock);
+	p->remove_mapping(p, oblock);
+}
+
+static inline int policy_remove_cblock(struct dm_cache_policy *p, dm_cblock_t cblock)
+{
+	return p->remove_cblock(p, cblock);
 }

 static inline void policy_force_mapping(struct dm_cache_policy *p,
(The diff for this file is not shown because it is too large.)
@@ -119,13 +119,13 @@ struct dm_cache_policy *dm_cache_policy_create(const char *name,
 	type = get_policy(name);
 	if (!type) {
 		DMWARN("unknown policy type");
-		return NULL;
+		return ERR_PTR(-EINVAL);
 	}

 	p = type->create(cache_size, origin_size, cache_block_size);
 	if (!p) {
 		put_policy(type);
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 	}
 	p->private = type;
@@ -135,9 +135,6 @@ struct dm_cache_policy {
 	 */
 	int (*lookup)(struct dm_cache_policy *p, dm_oblock_t oblock, dm_cblock_t *cblock);

-	/*
-	 * oblock must be a mapped block. Must not block.
-	 */
 	void (*set_dirty)(struct dm_cache_policy *p, dm_oblock_t oblock);
 	void (*clear_dirty)(struct dm_cache_policy *p, dm_oblock_t oblock);

@@ -159,8 +156,24 @@ struct dm_cache_policy {
 	void (*force_mapping)(struct dm_cache_policy *p, dm_oblock_t current_oblock,
 			      dm_oblock_t new_oblock);

-	int (*writeback_work)(struct dm_cache_policy *p, dm_oblock_t *oblock, dm_cblock_t *cblock);
+	/*
+	 * This is called via the invalidate_cblocks message. It is
+	 * possible the particular cblock has already been removed due to a
+	 * write io in passthrough mode. In which case this should return
+	 * -ENODATA.
+	 */
+	int (*remove_cblock)(struct dm_cache_policy *p, dm_cblock_t cblock);
+
+	/*
+	 * Provide a dirty block to be written back by the core target.
+	 *
+	 * Returns:
+	 *
+	 * 0 and @cblock,@oblock: block to write back provided
+	 *
+	 * -ENODATA: no dirty blocks available
+	 */
+	int (*writeback_work)(struct dm_cache_policy *p, dm_oblock_t *oblock, dm_cblock_t *cblock);

 	/*
 	 * How full is the cache?
(The diff for this file is not shown because it is too large.)
@@ -2,6 +2,7 @@
  * Copyright (C) 2003 Christophe Saout <christophe@saout.de>
  * Copyright (C) 2004 Clemens Fruhwirth <clemens@endorphin.org>
  * Copyright (C) 2006-2009 Red Hat, Inc. All rights reserved.
+ * Copyright (C) 2013 Milan Broz <gmazyland@gmail.com>
  *
  * This file is released under the GPL.
  */
@@ -98,6 +99,13 @@ struct iv_lmk_private {
 	u8 *seed;
 };

+#define TCW_WHITENING_SIZE 16
+struct iv_tcw_private {
+	struct crypto_shash *crc32_tfm;
+	u8 *iv_seed;
+	u8 *whitening;
+};
+
 /*
  * Crypt: maps a linear range of a block device
  * and encrypts / decrypts at the same time.
@@ -139,6 +147,7 @@ struct crypt_config {
 		struct iv_essiv_private essiv;
 		struct iv_benbi_private benbi;
 		struct iv_lmk_private lmk;
+		struct iv_tcw_private tcw;
 	} iv_gen_private;
 	sector_t iv_offset;
 	unsigned int iv_size;
@@ -171,7 +180,8 @@ struct crypt_config {

 	unsigned long flags;
 	unsigned int key_size;
-	unsigned int key_parts;
+	unsigned int key_parts;      /* independent parts in key buffer */
+	unsigned int key_extra_size; /* additional keys length */
 	u8 key[0];
 };

@@ -230,6 +240,16 @@ static struct crypto_ablkcipher *any_tfm(struct crypt_config *cc)
  *         version 3: the same as version 2 with additional IV seed
  *                    (it uses 65 keys, last key is used as IV seed)
  *
+ * tcw:  Compatible implementation of the block chaining mode used
+ *       by the TrueCrypt device encryption system (prior to version 4.1).
+ *       For more info see: http://www.truecrypt.org
+ *       It operates on full 512 byte sectors and uses CBC
+ *       with an IV derived from initial key and the sector number.
+ *       In addition, whitening value is applied on every sector, whitening
+ *       is calculated from initial key, sector number and mixed using CRC32.
+ *       Note that this encryption scheme is vulnerable to watermarking attacks
+ *       and should be used for old compatible containers access only.
+ *
  * plumb: unimplemented, see:
  *   http://article.gmane.org/gmane.linux.kernel.device-mapper.dm-crypt/454
  */
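In practice this IV mode is normally set up by cryptsetup's TCRYPT support rather than by hand-written tables; a hedged usage example, assuming cryptsetup 1.6 or later and an illustrative device name:

    # Open a legacy TrueCrypt container read-only for inspection; for
    # pre-4.1 CBC containers cryptsetup builds a dm-crypt table that uses
    # the tcw IV mode added here.
    cryptsetup open --type tcrypt --readonly /dev/sdb tcrypt_vol
    dmsetup table tcrypt_vol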
@@ -530,7 +550,7 @@ static int crypt_iv_lmk_one(struct crypt_config *cc, u8 *iv,
 		char ctx[crypto_shash_descsize(lmk->hash_tfm)];
 	} sdesc;
 	struct md5_state md5state;
-	u32 buf[4];
+	__le32 buf[4];
 	int i, r;

 	sdesc.desc.tfm = lmk->hash_tfm;
@@ -608,6 +628,153 @@ static int crypt_iv_lmk_post(struct crypt_config *cc, u8 *iv,
 	return r;
 }

+static void crypt_iv_tcw_dtr(struct crypt_config *cc)
+{
+	struct iv_tcw_private *tcw = &cc->iv_gen_private.tcw;
+
+	kzfree(tcw->iv_seed);
+	tcw->iv_seed = NULL;
+	kzfree(tcw->whitening);
+	tcw->whitening = NULL;
+
+	if (tcw->crc32_tfm && !IS_ERR(tcw->crc32_tfm))
+		crypto_free_shash(tcw->crc32_tfm);
+	tcw->crc32_tfm = NULL;
+}
+
+static int crypt_iv_tcw_ctr(struct crypt_config *cc, struct dm_target *ti,
+			    const char *opts)
+{
+	struct iv_tcw_private *tcw = &cc->iv_gen_private.tcw;
+
+	if (cc->key_size <= (cc->iv_size + TCW_WHITENING_SIZE)) {
+		ti->error = "Wrong key size for TCW";
+		return -EINVAL;
+	}
+
+	tcw->crc32_tfm = crypto_alloc_shash("crc32", 0, 0);
+	if (IS_ERR(tcw->crc32_tfm)) {
+		ti->error = "Error initializing CRC32 in TCW";
+		return PTR_ERR(tcw->crc32_tfm);
+	}
+
+	tcw->iv_seed = kzalloc(cc->iv_size, GFP_KERNEL);
+	tcw->whitening = kzalloc(TCW_WHITENING_SIZE, GFP_KERNEL);
+	if (!tcw->iv_seed || !tcw->whitening) {
+		crypt_iv_tcw_dtr(cc);
+		ti->error = "Error allocating seed storage in TCW";
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static int crypt_iv_tcw_init(struct crypt_config *cc)
+{
+	struct iv_tcw_private *tcw = &cc->iv_gen_private.tcw;
+	int key_offset = cc->key_size - cc->iv_size - TCW_WHITENING_SIZE;
+
+	memcpy(tcw->iv_seed, &cc->key[key_offset], cc->iv_size);
+	memcpy(tcw->whitening, &cc->key[key_offset + cc->iv_size],
+	       TCW_WHITENING_SIZE);
+
+	return 0;
+}
+
+static int crypt_iv_tcw_wipe(struct crypt_config *cc)
+{
+	struct iv_tcw_private *tcw = &cc->iv_gen_private.tcw;
+
+	memset(tcw->iv_seed, 0, cc->iv_size);
+	memset(tcw->whitening, 0, TCW_WHITENING_SIZE);
+
+	return 0;
+}
+
+static int crypt_iv_tcw_whitening(struct crypt_config *cc,
+				  struct dm_crypt_request *dmreq,
+				  u8 *data)
+{
+	struct iv_tcw_private *tcw = &cc->iv_gen_private.tcw;
+	u64 sector = cpu_to_le64((u64)dmreq->iv_sector);
+	u8 buf[TCW_WHITENING_SIZE];
+	struct {
+		struct shash_desc desc;
+		char ctx[crypto_shash_descsize(tcw->crc32_tfm)];
+	} sdesc;
+	int i, r;
+
+	/* xor whitening with sector number */
+	memcpy(buf, tcw->whitening, TCW_WHITENING_SIZE);
+	crypto_xor(buf, (u8 *)&sector, 8);
+	crypto_xor(&buf[8], (u8 *)&sector, 8);
+
+	/* calculate crc32 for every 32bit part and xor it */
+	sdesc.desc.tfm = tcw->crc32_tfm;
+	sdesc.desc.flags = CRYPTO_TFM_REQ_MAY_SLEEP;
+	for (i = 0; i < 4; i++) {
+		r = crypto_shash_init(&sdesc.desc);
+		if (r)
+			goto out;
+		r = crypto_shash_update(&sdesc.desc, &buf[i * 4], 4);
+		if (r)
+			goto out;
+		r = crypto_shash_final(&sdesc.desc, &buf[i * 4]);
+		if (r)
+			goto out;
+	}
+	crypto_xor(&buf[0], &buf[12], 4);
+	crypto_xor(&buf[4], &buf[8], 4);
+
+	/* apply whitening (8 bytes) to whole sector */
+	for (i = 0; i < ((1 << SECTOR_SHIFT) / 8); i++)
+		crypto_xor(data + i * 8, buf, 8);
+out:
+	memset(buf, 0, sizeof(buf));
+	return r;
+}
+
+static int crypt_iv_tcw_gen(struct crypt_config *cc, u8 *iv,
+			    struct dm_crypt_request *dmreq)
+{
+	struct iv_tcw_private *tcw = &cc->iv_gen_private.tcw;
+	u64 sector = cpu_to_le64((u64)dmreq->iv_sector);
+	u8 *src;
+	int r = 0;
+
+	/* Remove whitening from ciphertext */
+	if (bio_data_dir(dmreq->ctx->bio_in) != WRITE) {
+		src = kmap_atomic(sg_page(&dmreq->sg_in));
+		r = crypt_iv_tcw_whitening(cc, dmreq, src + dmreq->sg_in.offset);
+		kunmap_atomic(src);
+	}
+
+	/* Calculate IV */
+	memcpy(iv, tcw->iv_seed, cc->iv_size);
+	crypto_xor(iv, (u8 *)&sector, 8);
+	if (cc->iv_size > 8)
+		crypto_xor(&iv[8], (u8 *)&sector, cc->iv_size - 8);
+
+	return r;
+}
+
+static int crypt_iv_tcw_post(struct crypt_config *cc, u8 *iv,
+			     struct dm_crypt_request *dmreq)
+{
+	u8 *dst;
+	int r;
+
+	if (bio_data_dir(dmreq->ctx->bio_in) != WRITE)
+		return 0;
+
+	/* Apply whitening on ciphertext */
+	dst = kmap_atomic(sg_page(&dmreq->sg_out));
+	r = crypt_iv_tcw_whitening(cc, dmreq, dst + dmreq->sg_out.offset);
+	kunmap_atomic(dst);
+
+	return r;
+}
+
 static struct crypt_iv_operations crypt_iv_plain_ops = {
 	.generator = crypt_iv_plain_gen
 };
@@ -643,6 +810,15 @@ static struct crypt_iv_operations crypt_iv_lmk_ops = {
 	.post	   = crypt_iv_lmk_post
 };

+static struct crypt_iv_operations crypt_iv_tcw_ops = {
+	.ctr	   = crypt_iv_tcw_ctr,
+	.dtr	   = crypt_iv_tcw_dtr,
+	.init	   = crypt_iv_tcw_init,
+	.wipe	   = crypt_iv_tcw_wipe,
+	.generator = crypt_iv_tcw_gen,
+	.post	   = crypt_iv_tcw_post
+};
+
 static void crypt_convert_init(struct crypt_config *cc,
 			       struct convert_context *ctx,
 			       struct bio *bio_out, struct bio *bio_in,
@@ -1274,9 +1450,12 @@ static int crypt_alloc_tfms(struct crypt_config *cc, char *ciphermode)

 static int crypt_setkey_allcpus(struct crypt_config *cc)
 {
-	unsigned subkey_size = cc->key_size >> ilog2(cc->tfms_count);
+	unsigned subkey_size;
 	int err = 0, i, r;

+	/* Ignore extra keys (which are used for IV etc) */
+	subkey_size = (cc->key_size - cc->key_extra_size) >> ilog2(cc->tfms_count);
+
 	for (i = 0; i < cc->tfms_count; i++) {
 		r = crypto_ablkcipher_setkey(cc->tfms[i],
 					     cc->key + (i * subkey_size),
@@ -1409,6 +1588,7 @@ static int crypt_ctr_cipher(struct dm_target *ti,
 		return -EINVAL;
 	}
 	cc->key_parts = cc->tfms_count;
+	cc->key_extra_size = 0;

 	cc->cipher = kstrdup(cipher, GFP_KERNEL);
 	if (!cc->cipher)
@@ -1460,13 +1640,6 @@ static int crypt_ctr_cipher(struct dm_target *ti,
 		goto bad;
 	}

-	/* Initialize and set key */
-	ret = crypt_set_key(cc, key);
-	if (ret < 0) {
-		ti->error = "Error decoding and setting key";
-		goto bad;
-	}
-
 	/* Initialize IV */
 	cc->iv_size = crypto_ablkcipher_ivsize(any_tfm(cc));
 	if (cc->iv_size)
@@ -1493,18 +1666,33 @@ static int crypt_ctr_cipher(struct dm_target *ti,
 		cc->iv_gen_ops = &crypt_iv_null_ops;
 	else if (strcmp(ivmode, "lmk") == 0) {
 		cc->iv_gen_ops = &crypt_iv_lmk_ops;
-		/* Version 2 and 3 is recognised according
+		/*
+		 * Version 2 and 3 is recognised according
 		 * to length of provided multi-key string.
 		 * If present (version 3), last key is used as IV seed.
+		 * All keys (including IV seed) are always the same size.
 		 */
-		if (cc->key_size % cc->key_parts)
+		if (cc->key_size % cc->key_parts) {
 			cc->key_parts++;
+			cc->key_extra_size = cc->key_size / cc->key_parts;
+		}
+	} else if (strcmp(ivmode, "tcw") == 0) {
+		cc->iv_gen_ops = &crypt_iv_tcw_ops;
+		cc->key_parts += 2; /* IV + whitening */
+		cc->key_extra_size = cc->iv_size + TCW_WHITENING_SIZE;
 	} else {
 		ret = -EINVAL;
 		ti->error = "Invalid IV mode";
 		goto bad;
 	}

+	/* Initialize and set key */
+	ret = crypt_set_key(cc, key);
+	if (ret < 0) {
+		ti->error = "Error decoding and setting key";
+		goto bad;
+	}
+
 	/* Allocate IV */
 	if (cc->iv_gen_ops && cc->iv_gen_ops->ctr) {
 		ret = cc->iv_gen_ops->ctr(cc, ti, ivopts);
@@ -1817,7 +2005,7 @@ static int crypt_iterate_devices(struct dm_target *ti,

 static struct target_type crypt_target = {
 	.name   = "crypt",
-	.version = {1, 12, 1},
+	.version = {1, 13, 0},
 	.module = THIS_MODULE,
 	.ctr    = crypt_ctr,
 	.dtr    = crypt_dtr,
@@ -57,7 +57,7 @@ struct vers_iter {
 static struct list_head _name_buckets[NUM_BUCKETS];
 static struct list_head _uuid_buckets[NUM_BUCKETS];

-static void dm_hash_remove_all(int keep_open_devices);
+static void dm_hash_remove_all(bool keep_open_devices, bool mark_deferred, bool only_deferred);

 /*
  * Guards access to both hash tables.
@@ -86,7 +86,7 @@ static int dm_hash_init(void)

 static void dm_hash_exit(void)
 {
-	dm_hash_remove_all(0);
+	dm_hash_remove_all(false, false, false);
 }

 /*-----------------------------------------------------------------
@@ -276,7 +276,7 @@ static struct dm_table *__hash_remove(struct hash_cell *hc)
 	return table;
 }

-static void dm_hash_remove_all(int keep_open_devices)
+static void dm_hash_remove_all(bool keep_open_devices, bool mark_deferred, bool only_deferred)
 {
 	int i, dev_skipped;
 	struct hash_cell *hc;
@@ -293,7 +293,8 @@ retry:
 		md = hc->md;
 		dm_get(md);

-		if (keep_open_devices && dm_lock_for_deletion(md)) {
+		if (keep_open_devices &&
+		    dm_lock_for_deletion(md, mark_deferred, only_deferred)) {
 			dm_put(md);
 			dev_skipped++;
 			continue;
@@ -450,6 +451,11 @@ static struct mapped_device *dm_hash_rename(struct dm_ioctl *param,
 	return md;
 }

+void dm_deferred_remove(void)
+{
+	dm_hash_remove_all(true, false, true);
+}
+
 /*-----------------------------------------------------------------
  * Implementation of the ioctl commands
  *---------------------------------------------------------------*/
@@ -461,7 +467,7 @@ typedef int (*ioctl_fn)(struct dm_ioctl *param, size_t param_size);

 static int remove_all(struct dm_ioctl *param, size_t param_size)
 {
-	dm_hash_remove_all(1);
+	dm_hash_remove_all(true, !!(param->flags & DM_DEFERRED_REMOVE), false);
 	param->data_size = 0;
 	return 0;
 }
@@ -683,6 +689,9 @@ static void __dev_status(struct mapped_device *md, struct dm_ioctl *param)
 	if (dm_suspended_md(md))
 		param->flags |= DM_SUSPEND_FLAG;

+	if (dm_test_deferred_remove_flag(md))
+		param->flags |= DM_DEFERRED_REMOVE;
+
 	param->dev = huge_encode_dev(disk_devt(disk));

 	/*
@@ -832,8 +841,13 @@ static int dev_remove(struct dm_ioctl *param, size_t param_size)
 	/*
 	 * Ensure the device is not open and nothing further can open it.
 	 */
-	r = dm_lock_for_deletion(md);
+	r = dm_lock_for_deletion(md, !!(param->flags & DM_DEFERRED_REMOVE), false);
 	if (r) {
+		if (r == -EBUSY && param->flags & DM_DEFERRED_REMOVE) {
+			up_write(&_hash_lock);
+			dm_put(md);
+			return 0;
+		}
 		DMDEBUG_LIMIT("unable to remove open device %s", hc->name);
 		up_write(&_hash_lock);
 		dm_put(md);
@@ -848,6 +862,8 @@ static int dev_remove(struct dm_ioctl *param, size_t param_size)
 		dm_table_destroy(t);
 	}

+	param->flags &= ~DM_DEFERRED_REMOVE;
+
 	if (!dm_kobject_uevent(md, KOBJ_REMOVE, param->event_nr))
 		param->flags |= DM_UEVENT_GENERATED_FLAG;

@@ -1469,6 +1485,14 @@ static int message_for_md(struct mapped_device *md, unsigned argc, char **argv,
 	if (**argv != '@')
 		return 2; /* no '@' prefix, deliver to target */

+	if (!strcasecmp(argv[0], "@cancel_deferred_remove")) {
+		if (argc != 1) {
+			DMERR("Invalid arguments for @cancel_deferred_remove");
+			return -EINVAL;
+		}
+		return dm_cancel_deferred_remove(md);
+	}
+
 	r = dm_stats_message(md, argc, argv, result, maxlen);
 	if (r < 2)
 		return r;
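From userspace the deferred-remove behaviour might be exercised roughly as follows; this assumes a dmsetup/lvm2 build that already knows about the new DM_DEFERRED_REMOVE flag, and the device name is illustrative:

    # If the device is still open, flag it for removal on last close
    # instead of failing with EBUSY.
    dmsetup remove --deferred my_cache
    # Changed our mind: cancel a pending deferred removal.
    dmsetup message my_cache 0 @cancel_deferred_remove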
@@ -87,6 +87,7 @@ struct multipath {
 	unsigned queue_if_no_path:1;	/* Queue I/O if last path fails? */
 	unsigned saved_queue_if_no_path:1; /* Saved state during suspension */
 	unsigned retain_attached_hw_handler:1; /* If there's already a hw_handler present, don't change it. */
+	unsigned pg_init_disabled:1;	/* pg_init is not currently allowed */

 	unsigned pg_init_retries;	/* Number of times to retry pg_init */
 	unsigned pg_init_count;		/* Number of times pg_init called */
@@ -390,13 +391,16 @@ static int map_io(struct multipath *m, struct request *clone,
 	if (was_queued)
 		m->queue_size--;

-	if ((pgpath && m->queue_io) ||
+	if (m->pg_init_required) {
+		if (!m->pg_init_in_progress)
+			queue_work(kmultipathd, &m->process_queued_ios);
+		r = DM_MAPIO_REQUEUE;
+	} else if ((pgpath && m->queue_io) ||
 	    (!pgpath && m->queue_if_no_path)) {
 		/* Queue for the daemon to resubmit */
 		list_add_tail(&clone->queuelist, &m->queued_ios);
 		m->queue_size++;
-		if ((m->pg_init_required && !m->pg_init_in_progress) ||
-		    !m->queue_io)
+		if (!m->queue_io)
 			queue_work(kmultipathd, &m->process_queued_ios);
 		pgpath = NULL;
 		r = DM_MAPIO_SUBMITTED;
@@ -497,7 +501,8 @@ static void process_queued_ios(struct work_struct *work)
 	    (!pgpath && !m->queue_if_no_path))
 		must_queue = 0;

-	if (m->pg_init_required && !m->pg_init_in_progress && pgpath)
+	if (m->pg_init_required && !m->pg_init_in_progress && pgpath &&
+	    !m->pg_init_disabled)
 		__pg_init_all_paths(m);

 	spin_unlock_irqrestore(&m->lock, flags);
@@ -942,10 +947,20 @@ static void multipath_wait_for_pg_init_completion(struct multipath *m)

 static void flush_multipath_work(struct multipath *m)
 {
+	unsigned long flags;
+
+	spin_lock_irqsave(&m->lock, flags);
+	m->pg_init_disabled = 1;
+	spin_unlock_irqrestore(&m->lock, flags);
+
 	flush_workqueue(kmpath_handlerd);
 	multipath_wait_for_pg_init_completion(m);
 	flush_workqueue(kmultipathd);
 	flush_work(&m->trigger_event);
+
+	spin_lock_irqsave(&m->lock, flags);
+	m->pg_init_disabled = 0;
+	spin_unlock_irqrestore(&m->lock, flags);
 }

 static void multipath_dtr(struct dm_target *ti)
@@ -1164,7 +1179,7 @@ static int pg_init_limit_reached(struct multipath *m, struct pgpath *pgpath)

 	spin_lock_irqsave(&m->lock, flags);

-	if (m->pg_init_count <= m->pg_init_retries)
+	if (m->pg_init_count <= m->pg_init_retries && !m->pg_init_disabled)
 		m->pg_init_required = 1;
 	else
 		limit_reached = 1;
@@ -1665,6 +1680,11 @@ static int multipath_busy(struct dm_target *ti)

 	spin_lock_irqsave(&m->lock, flags);

+	/* pg_init in progress, requeue until done */
+	if (m->pg_init_in_progress) {
+		busy = 1;
+		goto out;
+	}
 	/* Guess which priority_group will be used at next mapping time */
 	if (unlikely(!m->current_pgpath && m->next_pg))
 		pg = m->next_pg;
@@ -1714,7 +1734,7 @@ out:
  *---------------------------------------------------------------*/
 static struct target_type multipath_target = {
 	.name = "multipath",
-	.version = {1, 5, 1},
+	.version = {1, 6, 0},
 	.module = THIS_MODULE,
 	.ctr = multipath_ctr,
 	.dtr = multipath_dtr,
@@ -545,14 +545,28 @@ static int adjoin(struct dm_table *table, struct dm_target *ti)

 /*
  * Used to dynamically allocate the arg array.
+ *
+ * We do first allocation with GFP_NOIO because dm-mpath and dm-thin must
+ * process messages even if some device is suspended. These messages have a
+ * small fixed number of arguments.
+ *
+ * On the other hand, dm-switch needs to process bulk data using messages and
+ * excessive use of GFP_NOIO could cause trouble.
  */
 static char **realloc_argv(unsigned *array_size, char **old_argv)
 {
 	char **argv;
 	unsigned new_size;
+	gfp_t gfp;

-	new_size = *array_size ? *array_size * 2 : 64;
-	argv = kmalloc(new_size * sizeof(*argv), GFP_KERNEL);
+	if (*array_size) {
+		new_size = *array_size * 2;
+		gfp = GFP_KERNEL;
+	} else {
+		new_size = 8;
+		gfp = GFP_NOIO;
+	}
+	argv = kmalloc(new_size * sizeof(*argv), gfp);
 	if (argv) {
 		memcpy(argv, old_argv, *array_size * sizeof(*argv));
 		*array_size = new_size;
@@ -1548,9 +1562,12 @@ int dm_table_resume_targets(struct dm_table *t)
 			continue;

 		r = ti->type->preresume(ti);
-		if (r)
+		if (r) {
+			DMERR("%s: %s: preresume failed, error = %d",
+			      dm_device_name(t->md), ti->type->name, r);
 			return r;
+		}
 	}

 	for (i = 0; i < t->num_targets; i++) {
 		struct dm_target *ti = t->targets + i;
@@ -49,6 +49,11 @@ static unsigned int _major = 0;
 static DEFINE_IDR(_minor_idr);

 static DEFINE_SPINLOCK(_minor_lock);
+
+static void do_deferred_remove(struct work_struct *w);
+
+static DECLARE_WORK(deferred_remove_work, do_deferred_remove);
+
 /*
  * For bio-based dm.
  * One of these is allocated per bio.
@@ -116,6 +121,7 @@ EXPORT_SYMBOL_GPL(dm_get_rq_mapinfo);
 #define DMF_DELETING 4
 #define DMF_NOFLUSH_SUSPENDING 5
 #define DMF_MERGE_IS_OPTIONAL 6
+#define DMF_DEFERRED_REMOVE 7

 /*
  * A dummy definition to make RCU happy.
@@ -299,6 +305,8 @@ out_free_io_cache:

 static void local_exit(void)
 {
+	flush_scheduled_work();
+
 	kmem_cache_destroy(_rq_tio_cache);
 	kmem_cache_destroy(_io_cache);
 	unregister_blkdev(_major, _name);
@@ -404,7 +412,10 @@ static void dm_blk_close(struct gendisk *disk, fmode_t mode)

 	spin_lock(&_minor_lock);

-	atomic_dec(&md->open_count);
+	if (atomic_dec_and_test(&md->open_count) &&
+	    (test_bit(DMF_DEFERRED_REMOVE, &md->flags)))
+		schedule_work(&deferred_remove_work);
+
 	dm_put(md);

 	spin_unlock(&_minor_lock);
@@ -418,14 +429,18 @@ int dm_open_count(struct mapped_device *md)
 /*
  * Guarantees nothing is using the device before it's deleted.
  */
-int dm_lock_for_deletion(struct mapped_device *md)
+int dm_lock_for_deletion(struct mapped_device *md, bool mark_deferred, bool only_deferred)
 {
 	int r = 0;

 	spin_lock(&_minor_lock);

-	if (dm_open_count(md))
+	if (dm_open_count(md)) {
 		r = -EBUSY;
+		if (mark_deferred)
+			set_bit(DMF_DEFERRED_REMOVE, &md->flags);
+	} else if (only_deferred && !test_bit(DMF_DEFERRED_REMOVE, &md->flags))
+		r = -EEXIST;
 	else
 		set_bit(DMF_DELETING, &md->flags);

@@ -434,6 +449,27 @@ int dm_lock_for_deletion(struct mapped_device *md)
 	return r;
 }

+int dm_cancel_deferred_remove(struct mapped_device *md)
+{
+	int r = 0;
+
+	spin_lock(&_minor_lock);
+
+	if (test_bit(DMF_DELETING, &md->flags))
+		r = -EBUSY;
+	else
+		clear_bit(DMF_DEFERRED_REMOVE, &md->flags);
+
+	spin_unlock(&_minor_lock);
+
+	return r;
+}
+
+static void do_deferred_remove(struct work_struct *w)
+{
+	dm_deferred_remove();
+}
+
 sector_t dm_get_size(struct mapped_device *md)
 {
 	return get_capacity(md->disk);
@@ -2894,6 +2930,11 @@ int dm_suspended_md(struct mapped_device *md)
 	return test_bit(DMF_SUSPENDED, &md->flags);
 }

+int dm_test_deferred_remove_flag(struct mapped_device *md)
+{
+	return test_bit(DMF_DEFERRED_REMOVE, &md->flags);
+}
+
 int dm_suspended(struct dm_target *ti)
 {
 	return dm_suspended_md(dm_table_get_md(ti->table));
@@ -128,6 +128,16 @@ int dm_deleting_md(struct mapped_device *md);
  */
 int dm_suspended_md(struct mapped_device *md);

+/*
+ * Test if the device is scheduled for deferred remove.
+ */
+int dm_test_deferred_remove_flag(struct mapped_device *md);
+
+/*
+ * Try to remove devices marked for deferred removal.
+ */
+void dm_deferred_remove(void);
+
 /*
  * The device-mapper can be driven through one of two interfaces;
  * ioctl or filesystem, depending which patch you have applied.
@@ -158,7 +168,8 @@ void dm_stripe_exit(void);
 void dm_destroy(struct mapped_device *md);
 void dm_destroy_immediate(struct mapped_device *md);
 int dm_open_count(struct mapped_device *md);
-int dm_lock_for_deletion(struct mapped_device *md);
+int dm_lock_for_deletion(struct mapped_device *md, bool mark_deferred, bool only_deferred);
+int dm_cancel_deferred_remove(struct mapped_device *md);
 int dm_request_based(struct mapped_device *md);
 sector_t dm_get_size(struct mapped_device *md);
 struct dm_stats *dm_get_stats(struct mapped_device *md);
@@ -509,15 +509,18 @@ static int grow_add_tail_block(struct resize *resize)
 static int grow_needs_more_blocks(struct resize *resize)
 {
 	int r;
+	unsigned old_nr_blocks = resize->old_nr_full_blocks;

 	if (resize->old_nr_entries_in_last_block > 0) {
+		old_nr_blocks++;
+
 		r = grow_extend_tail_block(resize, resize->max_entries);
 		if (r)
 			return r;
 	}

 	r = insert_full_ablocks(resize->info, resize->size_of_block,
-				resize->old_nr_full_blocks,
+				old_nr_blocks,
 				resize->new_nr_full_blocks,
 				resize->max_entries, resize->value,
 				&resize->root);
@@ -140,26 +140,10 @@ static int sm_disk_inc_block(struct dm_space_map *sm, dm_block_t b)

 static int sm_disk_dec_block(struct dm_space_map *sm, dm_block_t b)
 {
-	int r;
-	uint32_t old_count;
 	enum allocation_event ev;
 	struct sm_disk *smd = container_of(sm, struct sm_disk, sm);

-	r = sm_ll_dec(&smd->ll, b, &ev);
-	if (!r && (ev == SM_FREE)) {
-		/*
-		 * It's only free if it's also free in the last
-		 * transaction.
-		 */
-		r = sm_ll_lookup(&smd->old_ll, b, &old_count);
-		if (r)
-			return r;
-
-		if (!old_count)
-			smd->nr_allocated_this_transaction--;
-	}
-
-	return r;
+	return sm_ll_dec(&smd->ll, b, &ev);
 }

 static int sm_disk_new_block(struct dm_space_map *sm, dm_block_t *b)
@@ -267,9 +267,9 @@ enum {
 #define DM_DEV_SET_GEOMETRY	_IOWR(DM_IOCTL, DM_DEV_SET_GEOMETRY_CMD, struct dm_ioctl)

 #define DM_VERSION_MAJOR	4
-#define DM_VERSION_MINOR	26
+#define DM_VERSION_MINOR	27
 #define DM_VERSION_PATCHLEVEL	0
-#define DM_VERSION_EXTRA	"-ioctl (2013-08-15)"
+#define DM_VERSION_EXTRA	"-ioctl (2013-10-30)"

 /* Status bits */
 #define DM_READONLY_FLAG	(1 << 0) /* In/Out */
@@ -341,4 +341,15 @@ enum {
  */
 #define DM_DATA_OUT_FLAG		(1 << 16) /* Out */

+/*
+ * If set with DM_DEV_REMOVE or DM_REMOVE_ALL this indicates that if
+ * the device cannot be removed immediately because it is still in use
+ * it should instead be scheduled for removal when it gets closed.
+ *
+ * On return from DM_DEV_REMOVE, DM_DEV_STATUS or other ioctls, this
+ * flag indicates that the device is scheduled to be removed when it
+ * gets closed.
+ */
+#define DM_DEFERRED_REMOVE		(1 << 17) /* In/Out */
+
 #endif /* _LINUX_DM_IOCTL_H */