The initial version of TCMU (in 3.18) does not properly handle
bidirectional SCSI commands -- those with both an in and out buffer. In
looking to fix this it also became clear that TCMU's support for adding
new types of entries (opcodes) to the command ring was broken. We need
to fix this now, so that future issues can be handled properly by adding
new opcodes.

We make the most of this ABI break by enabling bidi cmd handling within
TCMP_OP_CMD opcode. Add an iov_bidi_cnt field to tcmu_cmd_entry.req.
This enables TCMU to describe bidi commands, but further kernel work is
needed for full bidi support.

Enlarge tcmu_cmd_entry_hdr by 32 bits by pulling in cmd_id and __pad1. Turn
__pad1 into two 8 bit flags fields, for kernel-set and userspace-set flags,
"kflags" and "uflags" respectively.

Update version fields so userspace can tell the interface is changed.

Update tcmu-design.txt with details of how new stuff works:
- Specify an additional requirement for userspace to set UNKNOWN_OP
  (bit 0) in hdr.uflags for unknown/unhandled opcodes.
- Define how Data-In and Data-Out fields are described in req.iov[]

Changed in v2:
- Change name of SKIPPED bit to UNKNOWN bit
- PAD op does not set the bit any more
- Change len_op helper functions to take just len_op, not the whole struct
- Change version to 2 in missed spots, and use defines
- Add 16 unused bytes to cmd_entry.req, in case additional SAM cmd
  parameters need to be included
- Add iov_dif_cnt field to specify buffers used for DIF info in iov[]
- Rearrange fields to naturally align cdb_off
- Handle if userspace sets UNKNOWN_OP by indicating failure of the cmd
- Wrap some overly long UPDATE_HEAD lines

(Add missing req.iov_bidi_cnt + req.iov_dif_cnt zeroing - Ilias)

Signed-off-by: Andy Grover <agrover@redhat.com>
Reviewed-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This commit is contained in:
Andy Grover 2015-04-14 17:30:04 -07:00 коммит произвёл Nicholas Bellinger
Родитель 65204c84d7
Коммит 0ad46af8a6
3 изменённых файлов: 87 добавлений и 42 удалений

Просмотреть файл

@ -138,27 +138,40 @@ signals the kernel via a 4-byte write(). When cmd_head equals
cmd_tail, the ring is empty -- no commands are currently waiting to be cmd_tail, the ring is empty -- no commands are currently waiting to be
processed by userspace. processed by userspace.
TCMU commands start with a common header containing "len_op", a 32-bit TCMU commands are 8-byte aligned. They start with a common header
value that stores the length, as well as the opcode in the lowest containing "len_op", a 32-bit value that stores the length, as well as
unused bits. Currently only two opcodes are defined, TCMU_OP_PAD and the opcode in the lowest unused bits. It also contains cmd_id and
TCMU_OP_CMD. When userspace encounters a command with PAD opcode, it flags fields for setting by the kernel (kflags) and userspace
should skip ahead by the bytes in "length". (The kernel inserts PAD (uflags).
entries to ensure each CMD entry fits contigously into the circular
buffer.)
When userspace handles a CMD, it finds the SCSI CDB (Command Data Currently only two opcodes are defined, TCMU_OP_CMD and TCMU_OP_PAD.
Block) via tcmu_cmd_entry.req.cdb_off. This is an offset from the
start of the overall shared memory region, not the entry. The data
in/out buffers are accessible via tht req.iov[] array. Note that
each iov.iov_base is also an offset from the start of the region.
TCMU currently does not support BIDI operations. When the opcode is CMD, the entry in the command ring is a struct
tcmu_cmd_entry. Userspace finds the SCSI CDB (Command Data Block) via
tcmu_cmd_entry.req.cdb_off. This is an offset from the start of the
overall shared memory region, not the entry. The data in/out buffers
are accessible via tht req.iov[] array. iov_cnt contains the number of
entries in iov[] needed to describe either the Data-In or Data-Out
buffers. For bidirectional commands, iov_cnt specifies how many iovec
entries cover the Data-Out area, and iov_bidi_count specifies how many
iovec entries immediately after that in iov[] cover the Data-In
area. Just like other fields, iov.iov_base is an offset from the start
of the region.
When completing a command, userspace sets rsp.scsi_status, and When completing a command, userspace sets rsp.scsi_status, and
rsp.sense_buffer if necessary. Userspace then increments rsp.sense_buffer if necessary. Userspace then increments
mailbox.cmd_tail by entry.hdr.length (mod cmdr_size) and signals the mailbox.cmd_tail by entry.hdr.length (mod cmdr_size) and signals the
kernel via the UIO method, a 4-byte write to the file descriptor. kernel via the UIO method, a 4-byte write to the file descriptor.
When the opcode is PAD, userspace only updates cmd_tail as above --
it's a no-op. (The kernel inserts PAD entries to ensure each CMD entry
is contiguous within the command ring.)
More opcodes may be added in the future. If userspace encounters an
opcode it does not handle, it must set UNKNOWN_OP bit (bit 0) in
hdr.uflags, update cmd_tail, and proceed with processing additional
commands, if any.
The Data Area: The Data Area:
This is shared-memory space after the command ring. The organization This is shared-memory space after the command ring. The organization

Просмотреть файл

@ -344,8 +344,11 @@ static int tcmu_queue_cmd_ring(struct tcmu_cmd *tcmu_cmd)
entry = (void *) mb + CMDR_OFF + cmd_head; entry = (void *) mb + CMDR_OFF + cmd_head;
tcmu_flush_dcache_range(entry, sizeof(*entry)); tcmu_flush_dcache_range(entry, sizeof(*entry));
tcmu_hdr_set_op(&entry->hdr, TCMU_OP_PAD); tcmu_hdr_set_op(&entry->hdr.len_op, TCMU_OP_PAD);
tcmu_hdr_set_len(&entry->hdr, pad_size); tcmu_hdr_set_len(&entry->hdr.len_op, pad_size);
entry->hdr.cmd_id = 0; /* not used for PAD */
entry->hdr.kflags = 0;
entry->hdr.uflags = 0;
UPDATE_HEAD(mb->cmd_head, pad_size, udev->cmdr_size); UPDATE_HEAD(mb->cmd_head, pad_size, udev->cmdr_size);
@ -355,9 +358,11 @@ static int tcmu_queue_cmd_ring(struct tcmu_cmd *tcmu_cmd)
entry = (void *) mb + CMDR_OFF + cmd_head; entry = (void *) mb + CMDR_OFF + cmd_head;
tcmu_flush_dcache_range(entry, sizeof(*entry)); tcmu_flush_dcache_range(entry, sizeof(*entry));
tcmu_hdr_set_op(&entry->hdr, TCMU_OP_CMD); tcmu_hdr_set_op(&entry->hdr.len_op, TCMU_OP_CMD);
tcmu_hdr_set_len(&entry->hdr, command_size); tcmu_hdr_set_len(&entry->hdr.len_op, command_size);
entry->cmd_id = tcmu_cmd->cmd_id; entry->hdr.cmd_id = tcmu_cmd->cmd_id;
entry->hdr.kflags = 0;
entry->hdr.uflags = 0;
/* /*
* Fix up iovecs, and handle if allocation in data ring wrapped. * Fix up iovecs, and handle if allocation in data ring wrapped.
@ -407,6 +412,8 @@ static int tcmu_queue_cmd_ring(struct tcmu_cmd *tcmu_cmd)
kunmap_atomic(from); kunmap_atomic(from);
} }
entry->req.iov_cnt = iov_cnt; entry->req.iov_cnt = iov_cnt;
entry->req.iov_bidi_cnt = 0;
entry->req.iov_dif_cnt = 0;
/* All offsets relative to mb_addr, not start of entry! */ /* All offsets relative to mb_addr, not start of entry! */
cdb_off = CMDR_OFF + cmd_head + base_command_size; cdb_off = CMDR_OFF + cmd_head + base_command_size;
@ -464,6 +471,17 @@ static void tcmu_handle_completion(struct tcmu_cmd *cmd, struct tcmu_cmd_entry *
return; return;
} }
if (entry->hdr.uflags & TCMU_UFLAG_UNKNOWN_OP) {
UPDATE_HEAD(udev->data_tail, cmd->data_length, udev->data_size);
pr_warn("TCMU: Userspace set UNKNOWN_OP flag on se_cmd %p\n",
cmd->se_cmd);
transport_generic_request_failure(cmd->se_cmd,
TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE);
cmd->se_cmd = NULL;
kmem_cache_free(tcmu_cmd_cache, cmd);
return;
}
if (entry->rsp.scsi_status == SAM_STAT_CHECK_CONDITION) { if (entry->rsp.scsi_status == SAM_STAT_CHECK_CONDITION) {
memcpy(se_cmd->sense_buffer, entry->rsp.sense_buffer, memcpy(se_cmd->sense_buffer, entry->rsp.sense_buffer,
se_cmd->scsi_sense_length); se_cmd->scsi_sense_length);
@ -542,14 +560,16 @@ static unsigned int tcmu_handle_completions(struct tcmu_dev *udev)
tcmu_flush_dcache_range(entry, sizeof(*entry)); tcmu_flush_dcache_range(entry, sizeof(*entry));
if (tcmu_hdr_get_op(&entry->hdr) == TCMU_OP_PAD) { if (tcmu_hdr_get_op(entry->hdr.len_op) == TCMU_OP_PAD) {
UPDATE_HEAD(udev->cmdr_last_cleaned, tcmu_hdr_get_len(&entry->hdr), udev->cmdr_size); UPDATE_HEAD(udev->cmdr_last_cleaned,
tcmu_hdr_get_len(entry->hdr.len_op),
udev->cmdr_size);
continue; continue;
} }
WARN_ON(tcmu_hdr_get_op(&entry->hdr) != TCMU_OP_CMD); WARN_ON(tcmu_hdr_get_op(entry->hdr.len_op) != TCMU_OP_CMD);
spin_lock(&udev->commands_lock); spin_lock(&udev->commands_lock);
cmd = idr_find(&udev->commands, entry->cmd_id); cmd = idr_find(&udev->commands, entry->hdr.cmd_id);
if (cmd) if (cmd)
idr_remove(&udev->commands, cmd->cmd_id); idr_remove(&udev->commands, cmd->cmd_id);
spin_unlock(&udev->commands_lock); spin_unlock(&udev->commands_lock);
@ -562,7 +582,9 @@ static unsigned int tcmu_handle_completions(struct tcmu_dev *udev)
tcmu_handle_completion(cmd, entry); tcmu_handle_completion(cmd, entry);
UPDATE_HEAD(udev->cmdr_last_cleaned, tcmu_hdr_get_len(&entry->hdr), udev->cmdr_size); UPDATE_HEAD(udev->cmdr_last_cleaned,
tcmu_hdr_get_len(entry->hdr.len_op),
udev->cmdr_size);
handled++; handled++;
} }
@ -840,14 +862,14 @@ static int tcmu_configure_device(struct se_device *dev)
udev->data_size = TCMU_RING_SIZE - CMDR_SIZE; udev->data_size = TCMU_RING_SIZE - CMDR_SIZE;
mb = udev->mb_addr; mb = udev->mb_addr;
mb->version = 1; mb->version = TCMU_MAILBOX_VERSION;
mb->cmdr_off = CMDR_OFF; mb->cmdr_off = CMDR_OFF;
mb->cmdr_size = udev->cmdr_size; mb->cmdr_size = udev->cmdr_size;
WARN_ON(!PAGE_ALIGNED(udev->data_off)); WARN_ON(!PAGE_ALIGNED(udev->data_off));
WARN_ON(udev->data_size % PAGE_SIZE); WARN_ON(udev->data_size % PAGE_SIZE);
info->version = "1"; info->version = xstr(TCMU_MAILBOX_VERSION);
info->mem[0].name = "tcm-user command & data buffer"; info->mem[0].name = "tcm-user command & data buffer";
info->mem[0].addr = (phys_addr_t) udev->mb_addr; info->mem[0].addr = (phys_addr_t) udev->mb_addr;

Просмотреть файл

@ -6,7 +6,7 @@
#include <linux/types.h> #include <linux/types.h>
#include <linux/uio.h> #include <linux/uio.h>
#define TCMU_VERSION "1.0" #define TCMU_VERSION "2.0"
/* /*
* Ring Design * Ring Design
@ -39,9 +39,13 @@
* should process the next packet the same way, and so on. * should process the next packet the same way, and so on.
*/ */
#define TCMU_MAILBOX_VERSION 1 #define TCMU_MAILBOX_VERSION 2
#define ALIGN_SIZE 64 /* Should be enough for most CPUs */ #define ALIGN_SIZE 64 /* Should be enough for most CPUs */
/* See https://gcc.gnu.org/onlinedocs/cpp/Stringification.html */
#define xstr(s) str(s)
#define str(s) #s
struct tcmu_mailbox { struct tcmu_mailbox {
__u16 version; __u16 version;
__u16 flags; __u16 flags;
@ -64,31 +68,36 @@ enum tcmu_opcode {
* Only a few opcodes, and length is 8-byte aligned, so use low bits for opcode. * Only a few opcodes, and length is 8-byte aligned, so use low bits for opcode.
*/ */
struct tcmu_cmd_entry_hdr { struct tcmu_cmd_entry_hdr {
__u32 len_op; __u32 len_op;
__u16 cmd_id;
__u8 kflags;
#define TCMU_UFLAG_UNKNOWN_OP 0x1
__u8 uflags;
} __packed; } __packed;
#define TCMU_OP_MASK 0x7 #define TCMU_OP_MASK 0x7
static inline enum tcmu_opcode tcmu_hdr_get_op(struct tcmu_cmd_entry_hdr *hdr) static inline enum tcmu_opcode tcmu_hdr_get_op(__u32 len_op)
{ {
return hdr->len_op & TCMU_OP_MASK; return len_op & TCMU_OP_MASK;
} }
static inline void tcmu_hdr_set_op(struct tcmu_cmd_entry_hdr *hdr, enum tcmu_opcode op) static inline void tcmu_hdr_set_op(__u32 *len_op, enum tcmu_opcode op)
{ {
hdr->len_op &= ~TCMU_OP_MASK; *len_op &= ~TCMU_OP_MASK;
hdr->len_op |= (op & TCMU_OP_MASK); *len_op |= (op & TCMU_OP_MASK);
} }
static inline __u32 tcmu_hdr_get_len(struct tcmu_cmd_entry_hdr *hdr) static inline __u32 tcmu_hdr_get_len(__u32 len_op)
{ {
return hdr->len_op & ~TCMU_OP_MASK; return len_op & ~TCMU_OP_MASK;
} }
static inline void tcmu_hdr_set_len(struct tcmu_cmd_entry_hdr *hdr, __u32 len) static inline void tcmu_hdr_set_len(__u32 *len_op, __u32 len)
{ {
hdr->len_op &= TCMU_OP_MASK; *len_op &= TCMU_OP_MASK;
hdr->len_op |= len; *len_op |= len;
} }
/* Currently the same as SCSI_SENSE_BUFFERSIZE */ /* Currently the same as SCSI_SENSE_BUFFERSIZE */
@ -97,13 +106,14 @@ static inline void tcmu_hdr_set_len(struct tcmu_cmd_entry_hdr *hdr, __u32 len)
struct tcmu_cmd_entry { struct tcmu_cmd_entry {
struct tcmu_cmd_entry_hdr hdr; struct tcmu_cmd_entry_hdr hdr;
uint16_t cmd_id;
uint16_t __pad1;
union { union {
struct { struct {
uint32_t iov_cnt;
uint32_t iov_bidi_cnt;
uint32_t iov_dif_cnt;
uint64_t cdb_off; uint64_t cdb_off;
uint64_t iov_cnt; uint64_t __pad1;
uint64_t __pad2;
struct iovec iov[0]; struct iovec iov[0];
} req; } req;
struct { struct {