
    Re: iSCSI: SCSI timeout handling change

    A very good initiative (considering the low timeouts some OSs have and the
    fact that the expected behavior of the SCSI stack is to abort).
    Per our phone conversation, here is a summary of the changes:
       Ordered sending returns to being a MUST for all cases except recovery.
       For out-of-order (OOO) sending, implementers may use prefetching or the
       mechanism already in place: multiple connections.  The wording is:
       On any given connection, the iSCSI initiator MUST send the commands in
       increasing order of CmdSN except for retransmitted commands due to
       digest error recovery and connection recovery.
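
    As a rough illustration of the ordering rule above, here is a minimal
    Python sketch of a per-connection send check. The names sn_lt and
    ConnectionSendOrder are hypothetical and not taken from the draft; the
    sketch assumes CmdSN is a 32-bit wrapping counter and ignores immediate
    commands.

```python
MASK32 = 0xFFFFFFFF  # assumption: CmdSN is a 32-bit wrapping counter


def sn_lt(a, b):
    """True if serial number a precedes b, allowing for 32-bit wraparound."""
    return a != b and ((b - a) & MASK32) < 2**31


class ConnectionSendOrder:
    """Hypothetical per-connection check of the ordered-sending rule."""

    def __init__(self):
        self.last_sent = None  # highest CmdSN sent so far on this connection

    def may_send(self, cmd_sn, recovery_retransmit=False):
        # Retransmissions for digest error or connection recovery are exempt.
        if recovery_retransmit:
            return True
        # Every other command must carry a CmdSN above the last one sent.
        if self.last_sent is not None and not sn_lt(self.last_sent, cmd_sn):
            return False
        self.last_sent = cmd_sn
        return True
```

    An initiator could call may_send() just before queuing a PDU on a
    connection and treat a False result as an ordering bug.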
        Abort task - section 3.5.1 reads
    1.1.1     Function
       The Task Management functions provide an initiator with a way to
       explicitly control the execution of one or more Tasks (SCSI and iSCSI
       tasks). The Task Management functions are (for a more detailed
       description of SCSI task management see [SAM2]):
          1    ABORT TASK - aborts the task identified by the Referenced
          Task Tag field.
          2    ABORT TASK SET - aborts all Tasks issued by this initiator on
          the Logical Unit.
          3    CLEAR ACA - clears the Auto Contingent Allegiance condition.
          4    CLEAR TASK SET - Aborts all Tasks (from all initiators) for the
          Logical Unit.
          5    LOGICAL UNIT RESET
          6    TARGET WARM RESET
          7    TARGET COLD RESET
          8    TASK REASSIGN - reassigns connection allegiance for the task
          identified by the Initiator Task Tag field to this connection, thus
          resuming the iSCSI exchanges for the task.
       For all these functions, if executed, the Task Management Function
       Response MUST be returned using the Initiator Task Tag to identify the
       operation for which it is responding. All those functions apply to the
       referenced tasks regardless of whether they are proper SCSI tasks or
       tagged iSCSI operations.  Task management commands must be executed as
       if all the commands having a CmdSN lower than or equal to the task
       management CmdSN have been received by the target (i.e., have to be
       executed as if received for ordered delivery even when marked for
       immediate delivery).
       For all the tasks covered by the task management response (i.e., with
       CmdSN not higher than the task management command CmdSN), additional
       responses MUST NOT be delivered to the SCSI layer after the task
       management response. This requirement implies that the initiator must
       retain state until the status is received from the target for all
       aborted tasks, and that the target MUST deliver good status to the
       initiator for all aborted tasks for which no status has been delivered
       yet.  The task management response MAY be issued by the target
       immediately after marking all tasks to be aborted.
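
    To make the initiator-side consequence concrete, here is a minimal
    Python sketch, assuming a simple in-memory task table. InitiatorTaskTable,
    sn_lte and the scope_itts argument are hypothetical names used only for
    illustration. State for a covered task is kept until its status arrives,
    and that status is then withheld from the SCSI layer.

```python
MASK32 = 0xFFFFFFFF  # assumption: CmdSN is a 32-bit wrapping counter


def sn_lte(a, b):
    """True if serial number a is not higher than b (32-bit wraparound)."""
    return ((b - a) & MASK32) < 2**31


class InitiatorTaskTable:
    """Hypothetical initiator bookkeeping around a task management response."""

    def __init__(self):
        self.tasks = {}  # ITT -> {"cmd_sn": int, "aborted": bool}

    def add_task(self, itt, cmd_sn):
        self.tasks[itt] = {"cmd_sn": cmd_sn, "aborted": False}

    def on_tm_response(self, tm_cmd_sn, scope_itts):
        """Mark tasks in scope whose CmdSN is not higher than tm_cmd_sn."""
        marked = []
        for itt in scope_itts:
            task = self.tasks.get(itt)
            if task is not None and sn_lte(task["cmd_sn"], tm_cmd_sn):
                task["aborted"] = True  # keep state until status arrives
                marked.append(itt)
        return marked

    def on_status(self, itt):
        """Return True if this status may be delivered to the SCSI layer."""
        task = self.tasks.pop(itt, None)  # state is released only now
        if task is None:
            return False
        return not task["aborted"]
```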
       ABORT TASK MUST be issued on the same connection to which the task to be
       aborted is allegiant at the time the Task Management Request is issued
       if the connection is still active (it is not undergoing an implicit or
       explicit logout).  If the connection is being implicitly or explicitly
       logged out (i.e., no other request will be issued on the failing
       connection and no other response will be received on the failing
       connection) then an ABORT TASK function request may be issued on another
       connection. This Task Management request will then both establish a new
       allegiance for the command to be aborted, and abort it as well (i.e.,
       the task to be aborted will not have to be retried or reassigned, and
       its status if issued but not acknowledged will be reissued). For the
       ABORT TASK function, the target MUST NOT deliver additional responses
       after sending the task management response. If both the task response
       and the task management response are received while an ABORT TASK is
       executing, whether the initiator delivers the task response before
       the task management response is a matter of implementation.  This
       requirement implies that the initiator must retain state until the
       status is received from the target for an aborted task, and that the
       target MUST deliver good status to the initiator for an aborted task
       if no status has been delivered yet.  The task management response
       MUST be issued after the command status (if any) was issued.
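
    The connection-selection rule for ABORT TASK can be sketched as follows.
    This is only an illustration under simplifying assumptions: ConnState
    and pick_abort_connection are hypothetical names, and each connection is
    reduced to one of two states.

```python
from enum import Enum


class ConnState(Enum):
    ACTIVE = "active"
    LOGGING_OUT = "logging_out"  # implicit or explicit logout under way


def pick_abort_connection(allegiant_cid, states):
    """Return the connection ID on which to issue ABORT TASK, or None.

    states maps connection ID -> ConnState for the session.
    """
    # While the allegiant connection is still active, it MUST be used.
    if states.get(allegiant_cid) is ConnState.ACTIVE:
        return allegiant_cid
    # Otherwise the request may go out on another active connection, which
    # also establishes a new allegiance for the task being aborted.
    for cid, state in states.items():
        if cid != allegiant_cid and state is ConnState.ACTIVE:
            return cid
    return None  # no usable connection in this session
```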
       For the LOGICAL UNIT RESET function, the target MUST behave as dictated
       by the Logical Unit Reset function in [SAM2].
       The TARGET RESET function (WARM and COLD) implementation is OPTIONAL and
       when implemented should act as described below.  Target Reset MAY also
       be subject to SCSI access controls for the requesting initiator.  When
       not implemented or when authorization fails at target, Target Reset
       functions should end as if the function was executed successfully and
       the response qualifier will detail what was executed.
       For the TARGET WARM RESET and TARGET COLD RESET functions, the target
       cancels all pending operations; both are equivalent to the Target
       Reset function specified by [SAM2].  They can both affect many other
       initiators.
       In addition, for the TARGET COLD RESET the target then MUST terminate
       all of its TCP connections to all initiators (all sessions are
       terminated).
       For the TASK REASSIGN function, the target should reassign the
       connection allegiance to this new connection (and thus resume iSCSI
       exchanges for the task).  TASK REASSIGN MUST be received by the target
       ONLY after the connection on which the command was previously executing
       has been successfully logged-out.  For additional usage semantics, see
       section 8.1.
       TASK REASSIGN MUST be issued as an immediate command.
          Section 8.6 reads
    1.1  SCSI Timeouts
       An iSCSI initiator MAY attempt to plug a command sequence gap on the
       target end (in the absence of an acknowledgement of the command by way
       of ExpCmdSN) before the ULP timeout by retrying the unacknowledged
       command, as described in section 8.1.
       On a ULP timeout for a command that carried a CmdSN of n, if the
       ExpCmdSN is still less than (n+1) on ULP timeout, the iSCSI initiator
       MUST abort the command using the Abort Task task management function
       request.  In this process, the target may see the abort request while
       missing the original command itself due to one of the following reasons:
          - the original command was dropped due to digest error, or
          - the connection on which the original command was sent was
          successfully logged out (on logout, unacknowledged commands issued
          on the connection being logged out are discarded)
       If the abort request is received and the original command is missing,
       targets MUST consider the original command with that RefCmdSN to
       be received and issue a task management response with the response code
       "Task specified in the Referenced Task Tag field was not in task set"
       and any state referring to the aborted task (if any) at the initiator
       can be discarded.  If the original command exists, as with any abort the
       initiator expects a concluding status (that will not be delivered to
       SCSI) and the target MUST supply a status at abort time if it was not
       delivered earlier. The task management response is issued after the
    C." <cbm@rose.hp.com>
    Sent by:
    13-11-01 21:21
    Please respond to cbm

    To:      ips <>
    cc:
    Subject: iSCSI: SCSI timeout handling change

    Currently, if a command is not acknowledged by the ULP
    timeout, iSCSI requires initiators to tear down the session.
    The rationale behind this is that if the initiator could not
    get the command through (in possibly multiple retries) even
    by the ULP timeout, there's a serious problem with the session.
    But there are some drawbacks to this approach -
            - tearing down a session due to a NIC failure is
              disruptive to potentially several other active tasks
              on other NICs.
            - this puts those initiator implementations not wanting
              to do within-connection recovery (i.e. no retries) at
              a disadvantage, since one digest error would cause
              potentially several active I/Os to be terminated.
            - (albeit not very serious) this behavior is different
              from today's storage stacks' expectations - of being
              able to selectively abort one I/O on a timeout (with
              no command retransmissions).
    To address these issues, and also to simplify the current Task
    Management request PDU, I propose the following changes to handling
    SCSI timeouts -
    Following changes to section 3.5:
    - Abort Task MUST always be sent immediate.
    - Abort Task task management function request MUST be sent
      with its CmdSN equal to the CmdSN of the task to be aborted,
      and the Referenced Task Tag initialized to the ITT of the
      task to be aborted.
    - Consequent to the above, drop the RefCmdSN field in the
      Task Management command payload that is currently only
      used by the Abort Task function.
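
    Under the three changes above, building an Abort Task request could look
    roughly like the Python sketch below. It models fields only, not the PDU
    wire format. TaskMgmtRequest and make_abort_task are hypothetical names,
    and the function code 1 is taken from the numbered function list in
    section 3.5.1.

```python
from dataclasses import dataclass

ABORT_TASK = 1  # function number from the Task Management function list


@dataclass
class TaskMgmtRequest:
    """Field-level model of a request; note there is no RefCmdSN field."""
    function: int
    immediate: bool
    cmd_sn: int
    itt: int                  # Initiator Task Tag of this request itself
    referenced_task_tag: int  # ITT of the task to be aborted


def make_abort_task(task_itt, task_cmd_sn, request_itt):
    return TaskMgmtRequest(
        function=ABORT_TASK,
        immediate=True,                # Abort Task is always sent immediate
        cmd_sn=task_cmd_sn,            # same CmdSN as the task to be aborted
        itt=request_itt,
        referenced_task_tag=task_itt,  # identifies the task by its ITT
    )
```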
    Following changes to section 8.6:
    Propose the following text to replace the current -
    An iSCSI initiator MAY attempt to plug a command sequence gap on
    the target end (in the absence of an acknowledgement of the command
    by way of ExpCmdSN) before the ULP timeout by retrying the
    unacknowledged command, as described in section 8.1.
    On a ULP timeout for a command that carried a CmdSN of n, if the
    ExpCmdSN is still less than (n+1) on ULP timeout, the iSCSI initiator
    MUST abort the command using the Abort Task task management function
    request.  In this process, the target may see the abort request
    before the original command itself due to one of the three reasons -
            - the original command was dropped due to digest error, or
            - the Abort Task request was shipped out-of-order
              on the same connection, or
            - the connection on which the original command was sent
              was successfully logged out.
    If the abort request is received prior to the original command,
    targets MUST consider the original command with that CmdSN to
    be received and discard the original command if and when received -
    i.e. treating it as a duplicate CmdSN.  For this reason, initiators
    that want to preserve command ordering within the same session
    MUST NOT issue Abort Task on an unacknowledged command.
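
    The proposed section 8.6 behavior can be sketched in Python as below.
    This is a simplified model, not an implementation: the function
    must_abort_on_ulp_timeout and the class TargetCmdTracker are hypothetical
    names, CmdSN is assumed to be a 32-bit wrapping counter, and the target
    tracks aborted CmdSNs in a plain set that is never pruned.

```python
MASK32 = 0xFFFFFFFF  # assumption: CmdSN is a 32-bit wrapping counter


def sn_lt(a, b):
    """True if serial number a precedes b, allowing for 32-bit wraparound."""
    return a != b and ((b - a) & MASK32) < 2**31


def must_abort_on_ulp_timeout(n, exp_cmd_sn):
    """Initiator side: command n is unacknowledged while ExpCmdSN < n + 1."""
    return sn_lt(exp_cmd_sn, (n + 1) & MASK32)


class TargetCmdTracker:
    """Target side: an abort may arrive before the command it refers to."""

    def __init__(self):
        self.tasks = {}              # CmdSN -> ITT of tasks in progress
        self.aborted_early = set()   # CmdSNs considered received via abort

    def on_abort_task(self, cmd_sn):
        if cmd_sn in self.tasks:
            del self.tasks[cmd_sn]
            return "aborted"
        # Original command not seen yet: consider its CmdSN received.
        self.aborted_early.add(cmd_sn)
        return "marked-received"

    def on_command(self, cmd_sn, itt):
        # A late original command is treated as a duplicate CmdSN.
        if cmd_sn in self.aborted_early or cmd_sn in self.tasks:
            return "discarded-duplicate"
        self.tasks[cmd_sn] = itt
        return "accepted"
```

    Because aborted_early is never pruned here, a stale entry would collide
    with a legitimate command once CmdSN wraps, which is the kind of hazard
    that a flushing requirement is meant to rule out.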
    Following changes to section
    - The above approach exposes the possibility that some stale
      (aborted from the target's perspective) commands could be stuck
      in the TCP connection long enough for the CmdSN to wrap, similar
      to the issue we dealt with for command retries.  So, aborting
      unacknowledged commands should require the same flushing
      actions described for command retries. [ At this point I would
      almost prefer to require flushing all connections every
      2^31 - 1 commands starting from InitCmdSN, rather than enumerating
      these cases individually...]
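
    To show why a wrap matters and what a blanket flushing rule might look
    like, here is a small Python sketch. It assumes 32-bit serial-number
    comparison for CmdSN; FlushCounter is a hypothetical helper, and what a
    "flush" actually does on each connection is left out.

```python
MASK32 = 0xFFFFFFFF       # assumption: CmdSN is a 32-bit wrapping counter
FLUSH_INTERVAL = 2**31 - 1  # interval suggested in the bracketed remark


def sn_lt(a, b):
    """True if serial number a precedes b, allowing for 32-bit wraparound."""
    return a != b and ((b - a) & MASK32) < 2**31


class FlushCounter:
    """Counts commands issued since the last flush of all connections."""

    def __init__(self, interval=FLUSH_INTERVAL):
        self.interval = interval
        self.since_flush = 0

    def on_command_issued(self):
        """Return True when all connections should be flushed now."""
        self.since_flush += 1
        if self.since_flush >= self.interval:
            self.since_flush = 0
            return True
        return False


# A command stuck for more than 2^31 CmdSNs compares as newer, not older:
stale = 100
current = (stale + 2**31 + 5) & MASK32
stale_looks_new = sn_lt(current, stale)
```

    With the default interval the counter fires once every 2^31 - 1
    commands; a small interval is convenient for experimenting.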
    Mallikarjun Chadalapaka
    Networked Storage Architecture
    Network Storage Solutions Organization
    MS 5668         Hewlett-Packard, Roseville.


Last updated: Thu Nov 15 13:17:53 2001