scsi: lpfc: Correct race with abort on completion path

On I/O completion, the driver takes an adapter-wide lock and nulls the
SCSI command back pointer.  Nulling the back pointer signifies that the
I/O was completed and that scsi_done() was called.  However, the completion
routine does not check whether the abort routine has already nulled the
pointer, so it may complete the I/O twice.

Make the following changes:

- Make forward progress (calling scsi_done()) only if the command pointer
  was non-NULL.

- Because taking the adapter-wide lock is very costly on a system under
  load, null the pointer with an xchg() operation rather than under the lock.
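
The two changes above can be illustrated outside the kernel. The following is a minimal userspace sketch, not the driver code: it uses C11 atomic_exchange() in place of the kernel's xchg(), and every demo_* name is hypothetical. It also assumes the abort path clears the pointer through the same atomic helper, which this commit message does not state.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures; not the lpfc types. */
struct demo_cmd {
	int done_calls;			/* times the command was completed */
};

struct demo_io_buf {
	_Atomic(struct demo_cmd *) cmd;	/* plays the role of lpfc_cmd->pCmd */
};

/*
 * Atomically take ownership of the command pointer.  Exactly one caller
 * receives the non-NULL pointer; every later caller receives NULL.
 */
static struct demo_cmd *demo_claim_cmd(struct demo_io_buf *buf)
{
	assert(buf != NULL);
	return atomic_exchange(&buf->cmd, NULL);
}

/* Completion path: complete the command only if this path won the claim. */
static bool demo_complete(struct demo_io_buf *buf)
{
	struct demo_cmd *cmd = demo_claim_cmd(buf);

	if (cmd == NULL)
		return false;		/* already claimed by the other path */
	cmd->done_calls++;		/* stands in for cmd->scsi_done(cmd) */
	return true;
}

/* Abort path (assumed): claims the command without completing it here. */
static bool demo_abort(struct demo_io_buf *buf)
{
	return demo_claim_cmd(buf) != NULL;
}
```

Whichever path performs the exchange first owns the command, so the completion is issued at most once and no adapter-wide lock is held while the pointer is cleared.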

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart 2018-09-10 10:30:44 -07:00 committed by Martin K. Petersen
Parent faf0a5f829
Commit ca7fb76e09
1 changed file: 11 additions and 3 deletions


@@ -4158,9 +4158,17 @@ lpfc_scsi_cmd_iocb_cmpl(struct lpfc_hba *phba, struct lpfc_iocbq *pIocbIn,
 	}
 	lpfc_scsi_unprep_dma_buf(phba, lpfc_cmd);
-	spin_lock_irqsave(&phba->hbalock, flags);
-	lpfc_cmd->pCmd = NULL;
-	spin_unlock_irqrestore(&phba->hbalock, flags);
+	/* If pCmd was set to NULL from abort path, do not call scsi_done */
+	if (xchg(&lpfc_cmd->pCmd, NULL) == NULL) {
+		lpfc_printf_vlog(vport, KERN_INFO, LOG_FCP,
+				 "0711 FCP cmd already NULL, sid: 0x%06x, "
+				 "did: 0x%06x, oxid: 0x%04x\n",
+				 vport->fc_myDID,
+				 (pnode) ? pnode->nlp_DID : 0,
+				 phba->sli_rev == LPFC_SLI_REV4 ?
+				 lpfc_cmd->cur_iocbq.sli4_xritag : 0xffff);
+		return;
+	}
 	/* The sdev is not guaranteed to be valid post scsi_done upcall. */
 	cmd->scsi_done(cmd);