WSL2-Linux-Kernel/include/scsi/libiscsi.h

506 строки
14 KiB
C
Исходник Постоянная ссылка Обычный вид История

/* SPDX-License-Identifier: GPL-2.0-or-later */
/*
* iSCSI lib definitions
*
* Copyright (C) 2006 Red Hat, Inc. All rights reserved.
* Copyright (C) 2004 - 2006 Mike Christie
* Copyright (C) 2004 - 2005 Dmitry Yusupov
* Copyright (C) 2004 - 2005 Alex Aizman
*/
#ifndef LIBISCSI_H
#define LIBISCSI_H
#include <linux/types.h>
#include <linux/wait.h>
#include <linux/mutex.h>
#include <linux/timer.h>
#include <linux/workqueue.h>
kfifo: move struct kfifo in place This is a new generic kernel FIFO implementation. The current kernel fifo API is not very widely used, because it has to many constrains. Only 17 files in the current 2.6.31-rc5 used it. FIFO's are like list's a very basic thing and a kfifo API which handles the most use case would save a lot of development time and memory resources. I think this are the reasons why kfifo is not in use: - The API is to simple, important functions are missing - A fifo can be only allocated dynamically - There is a requirement of a spinlock whether you need it or not - There is no support for data records inside a fifo So I decided to extend the kfifo in a more generic way without blowing up the API to much. The new API has the following benefits: - Generic usage: For kernel internal use and/or device driver. - Provide an API for the most use case. - Slim API: The whole API provides 25 functions. - Linux style habit. - DECLARE_KFIFO, DEFINE_KFIFO and INIT_KFIFO Macros - Direct copy_to_user from the fifo and copy_from_user into the fifo. - The kfifo itself is an in place member of the using data structure, this save an indirection access and does not waste the kernel allocator. - Lockless access: if only one reader and one writer is active on the fifo, which is the common use case, no additional locking is necessary. - Remove spinlock - give the user the freedom of choice what kind of locking to use if one is required. - Ability to handle records. Three type of records are supported: - Variable length records between 0-255 bytes, with a record size field of 1 bytes. - Variable length records between 0-65535 bytes, with a record size field of 2 bytes. - Fixed size records, which no record size field. - Preserve memory resource. - Performance! - Easy to use! This patch: Since most users want to have the kfifo as part of another object, reorganize the code to allow including struct kfifo in another data structure. This requires changing the kfifo_alloc and kfifo_init prototypes so that we pass an existing kfifo pointer into them. This patch changes the implementation and all existing users. [akpm@linux-foundation.org: fix warning] Signed-off-by: Stefani Seibold <stefani@seibold.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-22 01:37:26 +03:00
#include <linux/kfifo.h>
#include <linux/refcount.h>
#include <scsi/iscsi_proto.h>
#include <scsi/iscsi_if.h>
#include <scsi/scsi_transport_iscsi.h>
struct scsi_transport_template;
struct scsi_host_template;
struct scsi_device;
struct Scsi_Host;
struct scsi_target;
struct scsi_cmnd;
struct socket;
struct iscsi_transport;
struct iscsi_cls_session;
struct iscsi_cls_conn;
struct iscsi_session;
struct iscsi_nopin;
struct device;
#define ISCSI_DEF_XMIT_CMDS_MAX 128 /* must be power of 2 */
#define ISCSI_MGMT_CMDS_MAX 15
#define ISCSI_DEF_CMD_PER_LUN 32
/* Task Mgmt states */
enum {
TMF_INITIAL,
TMF_QUEUED,
TMF_SUCCESS,
TMF_FAILED,
TMF_TIMEDOUT,
TMF_NOT_FOUND,
};
#define ISID_SIZE 6
/* Connection flags */
#define ISCSI_CONN_FLAG_SUSPEND_TX 0
#define ISCSI_CONN_FLAG_SUSPEND_RX 1
#define ISCSI_CONN_FLAG_BOUND 2
#define ISCSI_ITT_MASK 0x1fff
#define ISCSI_TOTAL_CMDS_MAX 4096
/* this must be a power of two greater than ISCSI_MGMT_CMDS_MAX */
#define ISCSI_TOTAL_CMDS_MIN 16
#define ISCSI_AGE_SHIFT 28
#define ISCSI_AGE_MASK 0xf
#define ISCSI_ADDRESS_BUF_LEN 64
enum {
[SCSI] iscsi_tcp, libiscsi: initial AHS Support at libiscsi generic code - currently code assumes a storage space of pdu header is allocated at llds ctask and is pointed to by iscsi_cmd_task->hdr. Here I add a hdr_max field pertaining to that storage, and an hdr_len that accumulates the current use of the pdu-header. - Add an iscsi_next_hdr() inline which returns the next free space to write new Header at. Also iscsi_next_hdr() is used to retrieve the address at which to write the header-digest. - Add iscsi_add_hdr(length). What the user do is calls iscsi_next_hdr() for address of the new header, than calls iscsi_add_hdr(length) with the size of the new header. iscsi_add_hdr() will check if space is available and update to the new size. length must be padded according to standard. - Add 2 padding inline helpers thanks to Olaf. Current patch does not use them but Following patches will. Also moved definition of ISCSI_PAD_LEN to iscsi_proto.h which had PAD_WORD_LEN that was never used anywhere. - Let iscsi_prep_scsi_cmd_pdu() signal an Error return since now it is possible that it will fail. - I was tired of yet again writing a "this is a digest" comment next to sizeof(__u32) so I defined a new ISCSI_DIGEST_SIZE. Now I don't need any comments. Changed all places that used sizeof(__u32) or "4" in connection to a digest. iscsi_tcp specific code - At struct iscsi_tcp_cmd_task allocate maximum space allowed in standard for all headers following the iscsi_cmd header. and mark it so in iscsi_tcp_session_create() - At iscsi_send_cmd_hdr() retrieve the correct headers size and write header digest at iscsi_next_hdr(). Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2007-12-13 21:43:23 +03:00
/* this is the maximum possible storage for AHSs */
ISCSI_MAX_AHS_SIZE = sizeof(struct iscsi_ecdb_ahdr) +
sizeof(struct iscsi_rlength_ahdr),
ISCSI_DIGEST_SIZE = sizeof(__u32),
};
enum {
ISCSI_TASK_FREE,
ISCSI_TASK_COMPLETED,
ISCSI_TASK_PENDING,
ISCSI_TASK_RUNNING,
ISCSI_TASK_ABRT_TMF, /* aborted due to TMF */
ISCSI_TASK_ABRT_SESS_RECOV, /* aborted due to session recovery */
ISCSI_TASK_REQUEUE_SCSIQ, /* qcmd requeueing to scsi-ml */
};
struct iscsi_r2t_info {
__be32 ttt; /* copied from R2T */
__be32 exp_statsn; /* copied from R2T */
uint32_t data_length; /* copied from R2T */
uint32_t data_offset; /* copied from R2T */
int data_count; /* DATA-Out payload progress */
int datasn;
/* LLDs should set/update these values */
int sent; /* R2T sequence progress */
};
struct iscsi_task {
/*
[SCSI] iscsi_tcp, libiscsi: initial AHS Support at libiscsi generic code - currently code assumes a storage space of pdu header is allocated at llds ctask and is pointed to by iscsi_cmd_task->hdr. Here I add a hdr_max field pertaining to that storage, and an hdr_len that accumulates the current use of the pdu-header. - Add an iscsi_next_hdr() inline which returns the next free space to write new Header at. Also iscsi_next_hdr() is used to retrieve the address at which to write the header-digest. - Add iscsi_add_hdr(length). What the user do is calls iscsi_next_hdr() for address of the new header, than calls iscsi_add_hdr(length) with the size of the new header. iscsi_add_hdr() will check if space is available and update to the new size. length must be padded according to standard. - Add 2 padding inline helpers thanks to Olaf. Current patch does not use them but Following patches will. Also moved definition of ISCSI_PAD_LEN to iscsi_proto.h which had PAD_WORD_LEN that was never used anywhere. - Let iscsi_prep_scsi_cmd_pdu() signal an Error return since now it is possible that it will fail. - I was tired of yet again writing a "this is a digest" comment next to sizeof(__u32) so I defined a new ISCSI_DIGEST_SIZE. Now I don't need any comments. Changed all places that used sizeof(__u32) or "4" in connection to a digest. iscsi_tcp specific code - At struct iscsi_tcp_cmd_task allocate maximum space allowed in standard for all headers following the iscsi_cmd header. and mark it so in iscsi_tcp_session_create() - At iscsi_send_cmd_hdr() retrieve the correct headers size and write header digest at iscsi_next_hdr(). Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2007-12-13 21:43:23 +03:00
* Because LLDs allocate their hdr differently, this is a pointer
* and length to that storage. It must be setup at session
* creation time.
*/
struct iscsi_hdr *hdr;
[SCSI] iscsi_tcp, libiscsi: initial AHS Support at libiscsi generic code - currently code assumes a storage space of pdu header is allocated at llds ctask and is pointed to by iscsi_cmd_task->hdr. Here I add a hdr_max field pertaining to that storage, and an hdr_len that accumulates the current use of the pdu-header. - Add an iscsi_next_hdr() inline which returns the next free space to write new Header at. Also iscsi_next_hdr() is used to retrieve the address at which to write the header-digest. - Add iscsi_add_hdr(length). What the user do is calls iscsi_next_hdr() for address of the new header, than calls iscsi_add_hdr(length) with the size of the new header. iscsi_add_hdr() will check if space is available and update to the new size. length must be padded according to standard. - Add 2 padding inline helpers thanks to Olaf. Current patch does not use them but Following patches will. Also moved definition of ISCSI_PAD_LEN to iscsi_proto.h which had PAD_WORD_LEN that was never used anywhere. - Let iscsi_prep_scsi_cmd_pdu() signal an Error return since now it is possible that it will fail. - I was tired of yet again writing a "this is a digest" comment next to sizeof(__u32) so I defined a new ISCSI_DIGEST_SIZE. Now I don't need any comments. Changed all places that used sizeof(__u32) or "4" in connection to a digest. iscsi_tcp specific code - At struct iscsi_tcp_cmd_task allocate maximum space allowed in standard for all headers following the iscsi_cmd header. and mark it so in iscsi_tcp_session_create() - At iscsi_send_cmd_hdr() retrieve the correct headers size and write header digest at iscsi_next_hdr(). Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2007-12-13 21:43:23 +03:00
unsigned short hdr_max;
unsigned short hdr_len; /* accumulated size of hdr used */
/* copied values in case we need to send tmfs */
itt_t hdr_itt;
__be32 cmdsn;
struct scsi_lun lun;
int itt; /* this ITT */
unsigned imm_count; /* imm-data (bytes) */
/* offset in unsolicited stream (bytes); */
struct iscsi_r2t_info unsol_r2t;
char *data; /* mgmt payload */
unsigned data_count;
struct scsi_cmnd *sc; /* associated SCSI cmd*/
struct iscsi_conn *conn; /* used connection */
/* data processing tracking */
unsigned long last_xfer;
unsigned long last_timeout;
bool have_checked_conn;
/* T10 protection information */
bool protected;
/* state set/tested under session->lock */
int state;
refcount_t refcount;
struct list_head running; /* running cmd list */
void *dd_data; /* driver/transport data */
};
/* invalid scsi_task pointer */
#define INVALID_SCSI_TASK (struct iscsi_task *)-1l
static inline int iscsi_task_has_unsol_data(struct iscsi_task *task)
{
return task->unsol_r2t.data_length > task->unsol_r2t.sent;
}
static inline void* iscsi_next_hdr(struct iscsi_task *task)
[SCSI] iscsi_tcp, libiscsi: initial AHS Support at libiscsi generic code - currently code assumes a storage space of pdu header is allocated at llds ctask and is pointed to by iscsi_cmd_task->hdr. Here I add a hdr_max field pertaining to that storage, and an hdr_len that accumulates the current use of the pdu-header. - Add an iscsi_next_hdr() inline which returns the next free space to write new Header at. Also iscsi_next_hdr() is used to retrieve the address at which to write the header-digest. - Add iscsi_add_hdr(length). What the user do is calls iscsi_next_hdr() for address of the new header, than calls iscsi_add_hdr(length) with the size of the new header. iscsi_add_hdr() will check if space is available and update to the new size. length must be padded according to standard. - Add 2 padding inline helpers thanks to Olaf. Current patch does not use them but Following patches will. Also moved definition of ISCSI_PAD_LEN to iscsi_proto.h which had PAD_WORD_LEN that was never used anywhere. - Let iscsi_prep_scsi_cmd_pdu() signal an Error return since now it is possible that it will fail. - I was tired of yet again writing a "this is a digest" comment next to sizeof(__u32) so I defined a new ISCSI_DIGEST_SIZE. Now I don't need any comments. Changed all places that used sizeof(__u32) or "4" in connection to a digest. iscsi_tcp specific code - At struct iscsi_tcp_cmd_task allocate maximum space allowed in standard for all headers following the iscsi_cmd header. and mark it so in iscsi_tcp_session_create() - At iscsi_send_cmd_hdr() retrieve the correct headers size and write header digest at iscsi_next_hdr(). Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2007-12-13 21:43:23 +03:00
{
return (void*)task->hdr + task->hdr_len;
[SCSI] iscsi_tcp, libiscsi: initial AHS Support at libiscsi generic code - currently code assumes a storage space of pdu header is allocated at llds ctask and is pointed to by iscsi_cmd_task->hdr. Here I add a hdr_max field pertaining to that storage, and an hdr_len that accumulates the current use of the pdu-header. - Add an iscsi_next_hdr() inline which returns the next free space to write new Header at. Also iscsi_next_hdr() is used to retrieve the address at which to write the header-digest. - Add iscsi_add_hdr(length). What the user do is calls iscsi_next_hdr() for address of the new header, than calls iscsi_add_hdr(length) with the size of the new header. iscsi_add_hdr() will check if space is available and update to the new size. length must be padded according to standard. - Add 2 padding inline helpers thanks to Olaf. Current patch does not use them but Following patches will. Also moved definition of ISCSI_PAD_LEN to iscsi_proto.h which had PAD_WORD_LEN that was never used anywhere. - Let iscsi_prep_scsi_cmd_pdu() signal an Error return since now it is possible that it will fail. - I was tired of yet again writing a "this is a digest" comment next to sizeof(__u32) so I defined a new ISCSI_DIGEST_SIZE. Now I don't need any comments. Changed all places that used sizeof(__u32) or "4" in connection to a digest. iscsi_tcp specific code - At struct iscsi_tcp_cmd_task allocate maximum space allowed in standard for all headers following the iscsi_cmd header. and mark it so in iscsi_tcp_session_create() - At iscsi_send_cmd_hdr() retrieve the correct headers size and write header digest at iscsi_next_hdr(). Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2007-12-13 21:43:23 +03:00
}
static inline bool iscsi_task_is_completed(struct iscsi_task *task)
{
return task->state == ISCSI_TASK_COMPLETED ||
task->state == ISCSI_TASK_ABRT_TMF ||
task->state == ISCSI_TASK_ABRT_SESS_RECOV;
}
/* Connection's states */
enum {
ISCSI_CONN_INITIAL_STAGE,
ISCSI_CONN_STARTED,
ISCSI_CONN_STOPPED,
ISCSI_CONN_CLEANUP_WAIT,
};
struct iscsi_conn {
struct iscsi_cls_conn *cls_conn; /* ptr to class connection */
void *dd_data; /* iscsi_transport data */
struct iscsi_session *session; /* parent session */
/*
* conn_stop() flag: stop to recover, stop to terminate
*/
int stop_stage;
struct timer_list transport_timer;
unsigned long last_recv;
unsigned long last_ping;
int ping_timeout;
int recv_timeout;
struct iscsi_task *ping_task;
/* iSCSI connection-wide sequencing */
uint32_t exp_statsn;
uint32_t statsn;
/* control data */
int id; /* CID */
int c_stage; /* connection state */
/*
* Preallocated buffer for pdus that have data but do not
* originate from scsi-ml. We never have two pdus using the
* buffer at the same time. It is only allocated to
* the default max recv size because the pdus we support
* should always fit in this buffer
*/
char *data;
struct iscsi_task *login_task; /* mtask used for login/text */
struct iscsi_task *task; /* xmit task in progress */
/* xmit */
/* items must be added/deleted under frwd lock */
struct list_head mgmtqueue; /* mgmt (control) xmit queue */
struct list_head cmdqueue; /* data-path cmd queue */
struct list_head requeue; /* tasks needing another run */
struct work_struct xmitwork; /* per-conn. xmit workqueue */
/* recv */
struct work_struct recvwork;
unsigned long flags; /* ISCSI_CONN_FLAGs */
/* negotiated params */
unsigned max_recv_dlength; /* initiator_max_recv_dsl*/
unsigned max_xmit_dlength; /* target_max_recv_dsl */
int hdrdgst_en;
int datadgst_en;
int ifmarker_en;
int ofmarker_en;
/* values userspace uses to id a conn */
int persistent_port;
char *persistent_address;
unsigned max_segment_size;
unsigned tcp_xmit_wsf;
unsigned tcp_recv_wsf;
uint16_t keepalive_tmo;
uint16_t local_port;
uint8_t tcp_timestamp_stat;
uint8_t tcp_nagle_disable;
uint8_t tcp_wsf_disable;
uint8_t tcp_timer_scale;
uint8_t tcp_timestamp_en;
uint8_t fragment_disable;
uint8_t ipv4_tos;
uint8_t ipv6_traffic_class;
uint8_t ipv6_flow_label;
uint8_t is_fw_assigned_ipv6;
char *local_ipaddr;
/* MIB-statistics */
uint64_t txdata_octets;
uint64_t rxdata_octets;
uint32_t scsicmd_pdus_cnt;
uint32_t dataout_pdus_cnt;
uint32_t scsirsp_pdus_cnt;
uint32_t datain_pdus_cnt;
uint32_t r2t_pdus_cnt;
uint32_t tmfcmd_pdus_cnt;
int32_t tmfrsp_pdus_cnt;
/* custom statistics */
uint32_t eh_abort_cnt;
uint32_t fmr_unalign_cnt;
};
struct iscsi_pool {
kfifo: move struct kfifo in place This is a new generic kernel FIFO implementation. The current kernel fifo API is not very widely used, because it has to many constrains. Only 17 files in the current 2.6.31-rc5 used it. FIFO's are like list's a very basic thing and a kfifo API which handles the most use case would save a lot of development time and memory resources. I think this are the reasons why kfifo is not in use: - The API is to simple, important functions are missing - A fifo can be only allocated dynamically - There is a requirement of a spinlock whether you need it or not - There is no support for data records inside a fifo So I decided to extend the kfifo in a more generic way without blowing up the API to much. The new API has the following benefits: - Generic usage: For kernel internal use and/or device driver. - Provide an API for the most use case. - Slim API: The whole API provides 25 functions. - Linux style habit. - DECLARE_KFIFO, DEFINE_KFIFO and INIT_KFIFO Macros - Direct copy_to_user from the fifo and copy_from_user into the fifo. - The kfifo itself is an in place member of the using data structure, this save an indirection access and does not waste the kernel allocator. - Lockless access: if only one reader and one writer is active on the fifo, which is the common use case, no additional locking is necessary. - Remove spinlock - give the user the freedom of choice what kind of locking to use if one is required. - Ability to handle records. Three type of records are supported: - Variable length records between 0-255 bytes, with a record size field of 1 bytes. - Variable length records between 0-65535 bytes, with a record size field of 2 bytes. - Fixed size records, which no record size field. - Preserve memory resource. - Performance! - Easy to use! This patch: Since most users want to have the kfifo as part of another object, reorganize the code to allow including struct kfifo in another data structure. This requires changing the kfifo_alloc and kfifo_init prototypes so that we pass an existing kfifo pointer into them. This patch changes the implementation and all existing users. [akpm@linux-foundation.org: fix warning] Signed-off-by: Stefani Seibold <stefani@seibold.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-22 01:37:26 +03:00
struct kfifo queue; /* FIFO Queue */
void **pool; /* Pool of elements */
int max; /* Max number of elements */
};
/* Session's states */
enum {
ISCSI_STATE_FREE = 1,
ISCSI_STATE_LOGGED_IN,
ISCSI_STATE_FAILED,
ISCSI_STATE_TERMINATE,
ISCSI_STATE_IN_RECOVERY,
ISCSI_STATE_RECOVERY_FAILED,
ISCSI_STATE_LOGGING_OUT,
};
struct iscsi_session {
struct iscsi_cls_session *cls_session;
/*
* Syncs up the scsi eh thread with the iscsi eh thread when sending
* task management functions. This must be taken before the session
* and recv lock.
*/
struct mutex eh_mutex;
/* abort */
wait_queue_head_t ehwait; /* used in eh_abort() */
struct iscsi_tm tmhdr;
struct timer_list tmf_timer;
int tmf_state; /* see TMF_INITIAL, etc.*/
struct iscsi_task *running_aborted_task;
/* iSCSI session-wide sequencing */
uint32_t cmdsn;
uint32_t exp_cmdsn;
uint32_t max_cmdsn;
/* This tracks the reqs queued into the initiator */
uint32_t queued_cmdsn;
/* configuration */
int abort_timeout;
int lu_reset_timeout;
int tgt_reset_timeout;
int initial_r2t_en;
unsigned short max_r2t;
int imm_data_en;
unsigned first_burst;
unsigned max_burst;
int time2wait;
int time2retain;
int pdu_inorder_en;
int dataseq_inorder_en;
int erl;
int fast_abort;
int tpgt;
char *username;
char *username_in;
char *password;
char *password_in;
char *targetname;
char *targetalias;
char *ifacename;
char *initiatorname;
char *boot_root;
char *boot_nic;
char *boot_target;
char *portal_type;
char *discovery_parent_type;
uint16_t discovery_parent_idx;
uint16_t def_taskmgmt_tmo;
uint16_t tsid;
uint8_t auto_snd_tgt_disable;
uint8_t discovery_sess;
uint8_t chap_auth_en;
uint8_t discovery_logout_en;
uint8_t bidi_chap_en;
uint8_t discovery_auth_optional;
uint8_t isid[ISID_SIZE];
/* control data */
struct iscsi_transport *tt;
struct Scsi_Host *host;
struct iscsi_conn *leadconn; /* leading connection */
[SCSI] libiscsi: Reduce locking contention in fast path Replace the session lock with two locks, a forward lock and a backwards lock named frwd_lock and back_lock respectively. The forward lock protects resources that change while sending a request to the target, such as cmdsn, queued_cmdsn, and allocating task from the commands' pool with kfifo_out. The backward lock protects resources that change while processing a response or in error path, such as cmdsn_exp, cmdsn_max, and returning tasks to the commands' pool with kfifo_in. Under a steady state fast-path situation, that is when one or more processes/threads submit IO to an iscsi device and a single kernel upcall (e.g softirq) is dealing with processing of responses without errors, this patch eliminates the contention between the queuecommand()/request response/scsi_done() flows associated with iscsi sessions. Between the forward and the backward locks exists a strict locking hierarchy. The mutual exclusion zone protected by the forward lock can enclose the mutual exclusion zone protected by the backward lock but not vice versa. For example, in iscsi_conn_teardown or in iscsi_xmit_data when there is a failure and __iscsi_put_task is called, the backward lock is taken while the forward lock is still taken. On the other hand, if in the RX path a nop is to be sent, for example in iscsi_handle_reject or __iscsi_complete_pdu than the forward lock is released and the backward lock is taken for the duration of iscsi_send_nopout, later the backward lock is released and the forward lock is retaken. libiscsi_tcp uses two kernel fifos the r2t pool and the r2t queue. The insertion and deletion from these queues didn't corespond to the assumption taken by the new forward/backwards session locking paradigm. That is, in iscsi_tcp_clenup_task which belongs to the RX (backwards) path, r2t is taken out from r2t queue and inserted to the r2t pool. In iscsi_tcp_get_curr_r2t which belong to the TX (forward) path, r2t is also inserted to the r2t pool and another r2t is pulled from r2t queue. Only in iscsi_tcp_r2t_rsp which is called in the RX path but can requeue to the TX path, r2t is taken from the r2t pool and inserted to the r2t queue. In order to cope with this situation, two spin locks were added, pool2queue and queue2pool. The former protects extracting from the r2t pool and inserting to the r2t queue, and the later protects the extracing from the r2t queue and inserting to the r2t pool. Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> [minor fix up to apply cleanly and compile fix] Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-02-07 10:41:38 +04:00
/* Between the forward and the backward locks exists a strict locking
* hierarchy. The mutual exclusion zone protected by the forward lock
* can enclose the mutual exclusion zone protected by the backward lock
* but not vice versa.
*/
spinlock_t frwd_lock; /* protects session state, *
* cmdsn, queued_cmdsn *
* session resources: *
[SCSI] libiscsi: Reduce locking contention in fast path Replace the session lock with two locks, a forward lock and a backwards lock named frwd_lock and back_lock respectively. The forward lock protects resources that change while sending a request to the target, such as cmdsn, queued_cmdsn, and allocating task from the commands' pool with kfifo_out. The backward lock protects resources that change while processing a response or in error path, such as cmdsn_exp, cmdsn_max, and returning tasks to the commands' pool with kfifo_in. Under a steady state fast-path situation, that is when one or more processes/threads submit IO to an iscsi device and a single kernel upcall (e.g softirq) is dealing with processing of responses without errors, this patch eliminates the contention between the queuecommand()/request response/scsi_done() flows associated with iscsi sessions. Between the forward and the backward locks exists a strict locking hierarchy. The mutual exclusion zone protected by the forward lock can enclose the mutual exclusion zone protected by the backward lock but not vice versa. For example, in iscsi_conn_teardown or in iscsi_xmit_data when there is a failure and __iscsi_put_task is called, the backward lock is taken while the forward lock is still taken. On the other hand, if in the RX path a nop is to be sent, for example in iscsi_handle_reject or __iscsi_complete_pdu than the forward lock is released and the backward lock is taken for the duration of iscsi_send_nopout, later the backward lock is released and the forward lock is retaken. libiscsi_tcp uses two kernel fifos the r2t pool and the r2t queue. The insertion and deletion from these queues didn't corespond to the assumption taken by the new forward/backwards session locking paradigm. That is, in iscsi_tcp_clenup_task which belongs to the RX (backwards) path, r2t is taken out from r2t queue and inserted to the r2t pool. In iscsi_tcp_get_curr_r2t which belong to the TX (forward) path, r2t is also inserted to the r2t pool and another r2t is pulled from r2t queue. Only in iscsi_tcp_r2t_rsp which is called in the RX path but can requeue to the TX path, r2t is taken from the r2t pool and inserted to the r2t queue. In order to cope with this situation, two spin locks were added, pool2queue and queue2pool. The former protects extracting from the r2t pool and inserting to the r2t queue, and the later protects the extracing from the r2t queue and inserting to the r2t pool. Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> [minor fix up to apply cleanly and compile fix] Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-02-07 10:41:38 +04:00
* - cmdpool kfifo_out , *
* - mgmtpool, queues */
[SCSI] libiscsi: Reduce locking contention in fast path Replace the session lock with two locks, a forward lock and a backwards lock named frwd_lock and back_lock respectively. The forward lock protects resources that change while sending a request to the target, such as cmdsn, queued_cmdsn, and allocating task from the commands' pool with kfifo_out. The backward lock protects resources that change while processing a response or in error path, such as cmdsn_exp, cmdsn_max, and returning tasks to the commands' pool with kfifo_in. Under a steady state fast-path situation, that is when one or more processes/threads submit IO to an iscsi device and a single kernel upcall (e.g softirq) is dealing with processing of responses without errors, this patch eliminates the contention between the queuecommand()/request response/scsi_done() flows associated with iscsi sessions. Between the forward and the backward locks exists a strict locking hierarchy. The mutual exclusion zone protected by the forward lock can enclose the mutual exclusion zone protected by the backward lock but not vice versa. For example, in iscsi_conn_teardown or in iscsi_xmit_data when there is a failure and __iscsi_put_task is called, the backward lock is taken while the forward lock is still taken. On the other hand, if in the RX path a nop is to be sent, for example in iscsi_handle_reject or __iscsi_complete_pdu than the forward lock is released and the backward lock is taken for the duration of iscsi_send_nopout, later the backward lock is released and the forward lock is retaken. libiscsi_tcp uses two kernel fifos the r2t pool and the r2t queue. The insertion and deletion from these queues didn't corespond to the assumption taken by the new forward/backwards session locking paradigm. That is, in iscsi_tcp_clenup_task which belongs to the RX (backwards) path, r2t is taken out from r2t queue and inserted to the r2t pool. In iscsi_tcp_get_curr_r2t which belong to the TX (forward) path, r2t is also inserted to the r2t pool and another r2t is pulled from r2t queue. Only in iscsi_tcp_r2t_rsp which is called in the RX path but can requeue to the TX path, r2t is taken from the r2t pool and inserted to the r2t queue. In order to cope with this situation, two spin locks were added, pool2queue and queue2pool. The former protects extracting from the r2t pool and inserting to the r2t queue, and the later protects the extracing from the r2t queue and inserting to the r2t pool. Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> [minor fix up to apply cleanly and compile fix] Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2014-02-07 10:41:38 +04:00
spinlock_t back_lock; /* protects cmdsn_exp *
* cmdsn_max, *
* cmdpool kfifo_in */
int state; /* session state */
int age; /* counts session re-opens */
int scsi_cmds_max; /* max scsi commands */
int cmds_max; /* size of cmds array */
struct iscsi_task **cmds; /* Original Cmds arr */
struct iscsi_pool cmdpool; /* PDU's pool */
void *dd_data; /* LLD private data */
};
enum {
ISCSI_HOST_SETUP,
ISCSI_HOST_REMOVED,
};
struct iscsi_host {
char *initiatorname;
/* hw address or netdev iscsi connection is bound to */
char *hwaddress;
char *netdev;
wait_queue_head_t session_removal_wq;
/* protects sessions and state */
spinlock_t lock;
int num_sessions;
int state;
struct workqueue_struct *workq;
char workq_name[20];
};
/*
* scsi host template
*/
extern int iscsi_eh_abort(struct scsi_cmnd *sc);
extern int iscsi_eh_recover_target(struct scsi_cmnd *sc);
extern int iscsi_eh_session_reset(struct scsi_cmnd *sc);
extern int iscsi_eh_device_reset(struct scsi_cmnd *sc);
extern int iscsi_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *sc);
extern enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc);
/*
* iSCSI host helpers.
*/
#define iscsi_host_priv(_shost) \
(shost_priv(_shost) + sizeof(struct iscsi_host))
extern int iscsi_host_set_param(struct Scsi_Host *shost,
enum iscsi_host_param param, char *buf,
int buflen);
extern int iscsi_host_get_param(struct Scsi_Host *shost,
enum iscsi_host_param param, char *buf);
extern int iscsi_host_add(struct Scsi_Host *shost, struct device *pdev);
extern struct Scsi_Host *iscsi_host_alloc(struct scsi_host_template *sht,
int dd_data_size,
bool xmit_can_sleep);
extern void iscsi_host_remove(struct Scsi_Host *shost, bool is_shutdown);
extern void iscsi_host_free(struct Scsi_Host *shost);
extern int iscsi_target_alloc(struct scsi_target *starget);
extern int iscsi_host_get_max_scsi_cmds(struct Scsi_Host *shost,
uint16_t requested_cmds_max);
/*
* session management
*/
extern struct iscsi_cls_session *
iscsi_session_setup(struct iscsi_transport *, struct Scsi_Host *shost,
uint16_t, int, int, uint32_t, unsigned int);
scsi: iscsi_tcp: Fix UAF during logout when accessing the shost ipaddress [ Upstream commit 6f1d64b13097e85abda0f91b5638000afc5f9a06 ] Bug report and analysis from Ding Hui. During iSCSI session logout, if another task accesses the shost ipaddress attr, we can get a KASAN UAF report like this: [ 276.942144] BUG: KASAN: use-after-free in _raw_spin_lock_bh+0x78/0xe0 [ 276.942535] Write of size 4 at addr ffff8881053b45b8 by task cat/4088 [ 276.943511] CPU: 2 PID: 4088 Comm: cat Tainted: G E 6.1.0-rc8+ #3 [ 276.943997] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020 [ 276.944470] Call Trace: [ 276.944943] <TASK> [ 276.945397] dump_stack_lvl+0x34/0x48 [ 276.945887] print_address_description.constprop.0+0x86/0x1e7 [ 276.946421] print_report+0x36/0x4f [ 276.947358] kasan_report+0xad/0x130 [ 276.948234] kasan_check_range+0x35/0x1c0 [ 276.948674] _raw_spin_lock_bh+0x78/0xe0 [ 276.949989] iscsi_sw_tcp_host_get_param+0xad/0x2e0 [iscsi_tcp] [ 276.951765] show_host_param_ISCSI_HOST_PARAM_IPADDRESS+0xe9/0x130 [scsi_transport_iscsi] [ 276.952185] dev_attr_show+0x3f/0x80 [ 276.953005] sysfs_kf_seq_show+0x1fb/0x3e0 [ 276.953401] seq_read_iter+0x402/0x1020 [ 276.954260] vfs_read+0x532/0x7b0 [ 276.955113] ksys_read+0xed/0x1c0 [ 276.955952] do_syscall_64+0x38/0x90 [ 276.956347] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 276.956769] RIP: 0033:0x7f5d3a679222 [ 276.957161] Code: c0 e9 b2 fe ff ff 50 48 8d 3d 32 c0 0b 00 e8 a5 fe 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24 [ 276.958009] RSP: 002b:00007ffc864d16a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 276.958431] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f5d3a679222 [ 276.958857] RDX: 0000000000020000 RSI: 00007f5d3a4fe000 RDI: 0000000000000003 [ 276.959281] RBP: 00007f5d3a4fe000 R08: 00000000ffffffff R09: 0000000000000000 [ 276.959682] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000020000 [ 276.960126] R13: 0000000000000003 R14: 0000000000000000 R15: 0000557a26dada58 [ 276.960536] </TASK> [ 276.961357] Allocated by task 2209: [ 276.961756] kasan_save_stack+0x1e/0x40 [ 276.962170] kasan_set_track+0x21/0x30 [ 276.962557] __kasan_kmalloc+0x7e/0x90 [ 276.962923] __kmalloc+0x5b/0x140 [ 276.963308] iscsi_alloc_session+0x28/0x840 [scsi_transport_iscsi] [ 276.963712] iscsi_session_setup+0xda/0xba0 [libiscsi] [ 276.964078] iscsi_sw_tcp_session_create+0x1fd/0x330 [iscsi_tcp] [ 276.964431] iscsi_if_create_session.isra.0+0x50/0x260 [scsi_transport_iscsi] [ 276.964793] iscsi_if_recv_msg+0xc5a/0x2660 [scsi_transport_iscsi] [ 276.965153] iscsi_if_rx+0x198/0x4b0 [scsi_transport_iscsi] [ 276.965546] netlink_unicast+0x4d5/0x7b0 [ 276.965905] netlink_sendmsg+0x78d/0xc30 [ 276.966236] sock_sendmsg+0xe5/0x120 [ 276.966576] ____sys_sendmsg+0x5fe/0x860 [ 276.966923] ___sys_sendmsg+0xe0/0x170 [ 276.967300] __sys_sendmsg+0xc8/0x170 [ 276.967666] do_syscall_64+0x38/0x90 [ 276.968028] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 276.968773] Freed by task 2209: [ 276.969111] kasan_save_stack+0x1e/0x40 [ 276.969449] kasan_set_track+0x21/0x30 [ 276.969789] kasan_save_free_info+0x2a/0x50 [ 276.970146] __kasan_slab_free+0x106/0x190 [ 276.970470] __kmem_cache_free+0x133/0x270 [ 276.970816] device_release+0x98/0x210 [ 276.971145] kobject_cleanup+0x101/0x360 [ 276.971462] iscsi_session_teardown+0x3fb/0x530 [libiscsi] [ 276.971775] iscsi_sw_tcp_session_destroy+0xd8/0x130 [iscsi_tcp] [ 276.972143] iscsi_if_recv_msg+0x1bf1/0x2660 [scsi_transport_iscsi] [ 276.972485] iscsi_if_rx+0x198/0x4b0 [scsi_transport_iscsi] [ 276.972808] netlink_unicast+0x4d5/0x7b0 [ 276.973201] netlink_sendmsg+0x78d/0xc30 [ 276.973544] sock_sendmsg+0xe5/0x120 [ 276.973864] ____sys_sendmsg+0x5fe/0x860 [ 276.974248] ___sys_sendmsg+0xe0/0x170 [ 276.974583] __sys_sendmsg+0xc8/0x170 [ 276.974891] do_syscall_64+0x38/0x90 [ 276.975216] entry_SYSCALL_64_after_hwframe+0x63/0xcd We can easily reproduce by two tasks: 1. while :; do iscsiadm -m node --login; iscsiadm -m node --logout; done 2. while :; do cat \ /sys/devices/platform/host*/iscsi_host/host*/ipaddress; done iscsid | cat --------------------------------+--------------------------------------- |- iscsi_sw_tcp_session_destroy | |- iscsi_session_teardown | |- device_release | |- iscsi_session_release ||- dev_attr_show |- kfree | |- show_host_param_ | ISCSI_HOST_PARAM_IPADDRESS | |- iscsi_sw_tcp_host_get_param | |- r/w tcp_sw_host->session (UAF) |- iscsi_host_remove | |- iscsi_host_free | Fix the above bug by splitting the session removal into 2 parts: 1. removal from iSCSI class which includes sysfs and removal from host tracking. 2. freeing of session. During iscsi_tcp host and session removal we can remove the session from sysfs then remove the host from sysfs. At this point we know userspace is not accessing the kernel via sysfs so we can free the session and host. Link: https://lore.kernel.org/r/20230117193937.21244-2-michael.christie@oracle.com Signed-off-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Acked-by: Ding Hui <dinghui@sangfor.com.cn> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-17 22:39:36 +03:00
void iscsi_session_remove(struct iscsi_cls_session *cls_session);
void iscsi_session_free(struct iscsi_cls_session *cls_session);
extern void iscsi_session_teardown(struct iscsi_cls_session *);
extern void iscsi_session_recovery_timedout(struct iscsi_cls_session *);
extern int iscsi_set_param(struct iscsi_cls_conn *cls_conn,
enum iscsi_param param, char *buf, int buflen);
extern int iscsi_session_get_param(struct iscsi_cls_session *cls_session,
enum iscsi_param param, char *buf);
#define iscsi_session_printk(prefix, _sess, fmt, a...) \
iscsi_cls_session_printk(prefix, _sess->cls_session, fmt, ##a)
/*
* connection management
*/
extern struct iscsi_cls_conn *iscsi_conn_setup(struct iscsi_cls_session *,
int, uint32_t);
extern void iscsi_conn_teardown(struct iscsi_cls_conn *);
extern int iscsi_conn_start(struct iscsi_cls_conn *);
extern void iscsi_conn_stop(struct iscsi_cls_conn *, int);
extern int iscsi_conn_bind(struct iscsi_cls_session *, struct iscsi_cls_conn *,
int);
extern void iscsi_conn_unbind(struct iscsi_cls_conn *cls_conn, bool is_active);
extern void iscsi_conn_failure(struct iscsi_conn *conn, enum iscsi_err err);
extern void iscsi_session_failure(struct iscsi_session *session,
enum iscsi_err err);
extern int iscsi_conn_get_param(struct iscsi_cls_conn *cls_conn,
enum iscsi_param param, char *buf);
extern int iscsi_conn_get_addr_param(struct sockaddr_storage *addr,
enum iscsi_param param, char *buf);
extern void iscsi_suspend_tx(struct iscsi_conn *conn);
extern void iscsi_suspend_rx(struct iscsi_conn *conn);
extern void iscsi_suspend_queue(struct iscsi_conn *conn);
extern void iscsi_conn_queue_xmit(struct iscsi_conn *conn);
extern void iscsi_conn_queue_recv(struct iscsi_conn *conn);
#define iscsi_conn_printk(prefix, _c, fmt, a...) \
iscsi_cls_conn_printk(prefix, ((struct iscsi_conn *)_c)->cls_conn, \
fmt, ##a)
/*
* pdu and task processing
*/
extern void iscsi_update_cmdsn(struct iscsi_session *, struct iscsi_nopin *);
extern void iscsi_prep_data_out_pdu(struct iscsi_task *task,
struct iscsi_r2t_info *r2t,
struct iscsi_data *hdr);
extern int iscsi_conn_send_pdu(struct iscsi_cls_conn *, struct iscsi_hdr *,
char *, uint32_t);
extern int iscsi_complete_pdu(struct iscsi_conn *, struct iscsi_hdr *,
char *, int);
extern int __iscsi_complete_pdu(struct iscsi_conn *, struct iscsi_hdr *,
char *, int);
extern int iscsi_verify_itt(struct iscsi_conn *, itt_t);
extern struct iscsi_task *iscsi_itt_to_ctask(struct iscsi_conn *, itt_t);
extern struct iscsi_task *iscsi_itt_to_task(struct iscsi_conn *, itt_t);
extern void iscsi_requeue_task(struct iscsi_task *task);
extern void iscsi_put_task(struct iscsi_task *task);
extern void __iscsi_put_task(struct iscsi_task *task);
extern void __iscsi_get_task(struct iscsi_task *task);
extern void iscsi_complete_scsi_task(struct iscsi_task *task,
uint32_t exp_cmdsn, uint32_t max_cmdsn);
/*
* generic helpers
*/
extern void iscsi_pool_free(struct iscsi_pool *);
extern int iscsi_pool_init(struct iscsi_pool *, int, void ***, int);
extern int iscsi_switch_str_param(char **, char *);
[SCSI] iscsi_tcp, libiscsi: initial AHS Support at libiscsi generic code - currently code assumes a storage space of pdu header is allocated at llds ctask and is pointed to by iscsi_cmd_task->hdr. Here I add a hdr_max field pertaining to that storage, and an hdr_len that accumulates the current use of the pdu-header. - Add an iscsi_next_hdr() inline which returns the next free space to write new Header at. Also iscsi_next_hdr() is used to retrieve the address at which to write the header-digest. - Add iscsi_add_hdr(length). What the user do is calls iscsi_next_hdr() for address of the new header, than calls iscsi_add_hdr(length) with the size of the new header. iscsi_add_hdr() will check if space is available and update to the new size. length must be padded according to standard. - Add 2 padding inline helpers thanks to Olaf. Current patch does not use them but Following patches will. Also moved definition of ISCSI_PAD_LEN to iscsi_proto.h which had PAD_WORD_LEN that was never used anywhere. - Let iscsi_prep_scsi_cmd_pdu() signal an Error return since now it is possible that it will fail. - I was tired of yet again writing a "this is a digest" comment next to sizeof(__u32) so I defined a new ISCSI_DIGEST_SIZE. Now I don't need any comments. Changed all places that used sizeof(__u32) or "4" in connection to a digest. iscsi_tcp specific code - At struct iscsi_tcp_cmd_task allocate maximum space allowed in standard for all headers following the iscsi_cmd header. and mark it so in iscsi_tcp_session_create() - At iscsi_send_cmd_hdr() retrieve the correct headers size and write header digest at iscsi_next_hdr(). Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2007-12-13 21:43:23 +03:00
/*
* inline functions to deal with padding.
*/
static inline unsigned int
iscsi_padded(unsigned int len)
{
return (len + ISCSI_PAD_LEN - 1) & ~(ISCSI_PAD_LEN - 1);
}
static inline unsigned int
iscsi_padding(unsigned int len)
{
len &= (ISCSI_PAD_LEN - 1);
if (len)
len = ISCSI_PAD_LEN - len;
return len;
}
#endif