Re: iSCSI: SCSI timeout handling change

To: ips@ece.cmu.edu
Subject: Re: iSCSI: SCSI timeout handling change
From: "Julian Satran" <Julian_Satran@il.ibm.com>
Date: Thu, 15 Nov 2001 08:28:51 +0200
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu

Mallikarjun,

A very good initiative (considering the low timeouts some OSs have and the
fact that the expected behavior of the SCSI stack is to abort).
Following our phone conversation, here is a summary of the changes:

   Ordered sending returns to being a MUST for all cases except recovery.
   For out-of-order (OOO) sending, implementers may use prefetching or the
   mechanism already in place: multiple connections.  The wording in
   2.2.2.1 is:

   On any given connection, the iSCSI initiator MUST send the commands in
   increasing order of CmdSN except for retransmitted commands due to
   digest error recovery and connection recovery.
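
The ordering rule can be sketched as a small initiator-side check. This is
only a minimal sketch: it assumes CmdSN is a 32-bit counter compared with
wraparound (serial number) arithmetic, it ignores immediate commands, and
the names sn_lt and ConnectionSendState are hypothetical, not taken from
the draft.

```python
MOD = 1 << 32   # assumption: CmdSN is a 32-bit counter that wraps
HALF = 1 << 31


def sn_lt(a: int, b: int) -> bool:
    """True if serial number a precedes b, allowing for wraparound."""
    return a != b and ((b - a) % MOD) < HALF


class ConnectionSendState:
    """Tracks the highest CmdSN sent as a new command on one connection."""

    def __init__(self) -> None:
        self.last_new_cmdsn = None  # nothing sent yet

    def may_send(self, cmdsn: int, is_recovery_retransmit: bool = False) -> bool:
        # Retransmissions for digest error or connection recovery are exempt.
        if is_recovery_retransmit:
            return True
        if self.last_new_cmdsn is not None and not sn_lt(self.last_new_cmdsn, cmdsn):
            return False  # would violate increasing CmdSN order
        self.last_new_cmdsn = cmdsn
        return True
```

With this sketch, sending CmdSN 5 and then 6 is allowed, sending 4
afterwards is refused, and sending 4 again as a recovery retransmission is
allowed.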

   Abort Task - section 3.5.1 reads:
1.1.1     Function

   The Task Management functions provide an initiator with a way to
   explicitly control the execution of one or more Tasks (SCSI and iSCSI
   tasks). The Task Management functions are (for a more detailed
   description of SCSI task management see [SAM2]):

      1    ABORT TASK - aborts the task identified by the Referenced
      Task Tag field.
      2    ABORT TASK SET - aborts all Tasks issued by this initiator on
      the Logical Unit.
      3    CLEAR ACA - clears the Auto Contingent Allegiance condition.
      4    CLEAR TASK SET - Aborts all Tasks (from all initiators) for the
      Logical Unit.
      5    LOGICAL UNIT RESET
      6    TARGET WARM RESET
      7    TARGET COLD RESET
      8    TASK REASSIGN - reassign connection allegiance for the task
      identified by the Initiator Task Tag field on this connection, thus
      resuming the iSCSI exchanges for the task
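
For implementers, the function list above maps naturally onto an
enumeration. The sketch below uses the numbering from the list; the class
name TaskMgmtFunction and the helper affects_other_initiators are
hypothetical, and the helper only covers the functions this message itself
describes as reaching other initiators.

```python
from enum import IntEnum


class TaskMgmtFunction(IntEnum):
    """Function codes as numbered in the list above."""
    ABORT_TASK = 1
    ABORT_TASK_SET = 2
    CLEAR_ACA = 3
    CLEAR_TASK_SET = 4
    LOGICAL_UNIT_RESET = 5
    TARGET_WARM_RESET = 6
    TARGET_COLD_RESET = 7
    TASK_REASSIGN = 8


# Functions this message states can touch tasks of other initiators.
_CROSS_INITIATOR = frozenset({
    TaskMgmtFunction.CLEAR_TASK_SET,
    TaskMgmtFunction.TARGET_WARM_RESET,
    TaskMgmtFunction.TARGET_COLD_RESET,
})


def affects_other_initiators(function: TaskMgmtFunction) -> bool:
    """True if the text describes the function as affecting other initiators."""
    return function in _CROSS_INITIATOR
```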

   For all these functions, if executed, the Task Management Function
   Response MUST be returned using the Initiator Task Tag to identify the
   operation for which it is responding. All those functions apply to the
   referenced tasks regardless of whether they are proper SCSI tasks or
   tagged iSCSI operations.  Task management commands must be executed as
   if all the commands having a CmdSN lower than or equal to the task
   management CmdSN have been received by the target (i.e., have to be
   executed as if received for ordered delivery even when marked for
   immediate delivery).
   For all the tasks covered by the task management response (i.e., with
   CmdSN not higher than the task management command CmdSN), additional
   responses MUST NOT be delivered to the SCSI layer after the task
   management response. This requirement implies that the initiator must
   keep state until the status is received from the target for all
   aborted tasks, and that the target MUST deliver to the initiator good
   status for every aborted task for which no status has been delivered
   yet.  The task management response MAY be issued by the target
   immediately after marking all tasks to be aborted.
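
One way an initiator might keep the state this paragraph asks for is
sketched below. It is an illustration under assumptions, not the draft's
mechanism: the class AbortedTaskTracker and its method names are
hypothetical, and tasks are identified by their Initiator Task Tag.

```python
class AbortedTaskTracker:
    """Initiator-side record of aborted tasks still waiting for a status."""

    def __init__(self) -> None:
        self.awaiting_status = set()  # ITTs covered by a task management response

    def on_task_mgmt_response(self, covered_itts) -> None:
        # From now on, responses for these tasks must not reach the SCSI layer.
        self.awaiting_status.update(covered_itts)

    def on_status(self, itt: int) -> bool:
        """Return True if the status should be passed up to the SCSI layer."""
        if itt in self.awaiting_status:
            # Concluding status for an aborted task: drop the state, swallow it.
            self.awaiting_status.discard(itt)
            return False
        return True
```

The state for an aborted task is released only when its concluding status
arrives, which is why the target is required to send one.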

   ABORT TASK MUST be issued on the same connection to which the task to be
   aborted is allegiant at the time the Task Management Request is issued
   if the connection is still active (it is not undergoing an implicit or
   explicit logout).  If the connection is being implicitly or explicitly
   logged out (i.e., no other request will be issued on the failing
   connection and no other response will be received on the failing
   connection) then an ABORT TASK function request may be issued on another
   connection. This Task Management request will then both establish a new
   allegiance for the command to be aborted, and abort it as well (i.e.,
   the task to be aborted will not have to be retried or reassigned, and
   its status if issued but not acknowledged will be reissued). For the
   ABORT TASK function, the target MUST NOT deliver additional responses
   after sending the task management response. If both responses were
   delivered, whether the initiator delivers the task response before the
   task management response while an ABORT TASK is executing is a matter
   of implementation.  This requirement implies that the initiator must
   keep state until the status for an aborted task is received from the
   target, and that the target MUST deliver to the initiator good status
   for an aborted task if no status has been delivered yet.  The task
   management response MUST be issued after the command status (if any)
   has been issued.
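
The connection-selection rule for ABORT TASK can be sketched as follows.
This is a minimal sketch under assumptions: the Connection type, its
logging_out flag, and pick_abort_connection are hypothetical names used
only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Connection:
    cid: int
    logging_out: bool = False  # implicit or explicit logout in progress


def pick_abort_connection(allegiant: Connection,
                          others: Iterable[Connection]) -> Optional[Connection]:
    """Choose the connection on which to issue ABORT TASK."""
    # While the allegiant connection is still active, it MUST be used.
    if not allegiant.logging_out:
        return allegiant
    # Otherwise any other active connection may carry the request, which
    # then also establishes a new allegiance for the task being aborted.
    for conn in others:
        if not conn.logging_out:
            return conn
    return None  # no usable connection in this session
```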

   For the LOGICAL UNIT RESET function, the target MUST behave as dictated
   by the Logical Unit Reset function in [SAM2].

   The TARGET RESET function (WARM and COLD) implementation is OPTIONAL and,
   when implemented, should act as described below.  Target Reset MAY also
   be subject to SCSI access controls for the requesting initiator.  When
   not implemented or when authorization fails at the target, Target Reset
   functions should end as if the function was executed successfully and
   the response qualifier will detail what was executed.

   For the TARGET WARM RESET and TARGET COLD RESET functions, the target
   cancels all pending operations. Both functions are equivalent to the
   Target Reset function specified by [SAM2], and both can affect many
   other initiators.

   In addition, for the TARGET COLD RESET the target then MUST terminate
   all of its TCP connections to all initiators (all sessions are
   terminated).

   For the TASK REASSIGN function, the target should reassign the
   connection allegiance to this new connection (and thus resume iSCSI
   exchanges for the task).  TASK REASSIGN MUST be received by the target
   ONLY after the connection on which the command was previously executing
   has been successfully logged-out.  For additional usage semantics, see
   section 8.1.


   TASK REASSIGN MUST be issued as an immediate command.

      Section 8.6 reads

1.1  SCSI Timeouts

   An iSCSI initiator MAY attempt to plug a command sequence gap on the
   target end (in the absence of an acknowledgement of the command by way
   of ExpCmdSN) before the ULP timeout by retrying the unacknowledged
   command, as described in section 8.1.

   On a ULP timeout for a command that carried a CmdSN of n, if the
   ExpCmdSN is still less than (n+1) on ULP timeout, the iSCSI initiator
   MUST abort the command using the Abort Task task management function
   request.  In this process, the target may see the abort request while
   missing the original command itself due to one of the following reasons:

      - the original command was dropped due to digest error, or
      - the connection on which the original command was sent was
      successfully logged out (on logout, unacknowledged commands issued on
      the connection being logged out are discarded)
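
The timeout test "ExpCmdSN is still less than (n+1)" can be written out as
below. This is a sketch that assumes 32-bit CmdSN values compared with
wraparound arithmetic; sn_lt and must_abort_on_ulp_timeout are hypothetical
helper names.

```python
MOD = 1 << 32   # assumption: CmdSN and ExpCmdSN are 32-bit counters that wrap
HALF = 1 << 31


def sn_lt(a: int, b: int) -> bool:
    """True if serial number a precedes b, allowing for wraparound."""
    return a != b and ((b - a) % MOD) < HALF


def must_abort_on_ulp_timeout(cmdsn: int, exp_cmdsn: int) -> bool:
    """True if the timed-out command with CmdSN n is still unacknowledged."""
    return sn_lt(exp_cmdsn, (cmdsn + 1) % MOD)
```

For example, a command with CmdSN 10 is unacknowledged while ExpCmdSN is
10 and acknowledged once ExpCmdSN reaches 11.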

   If the abort request is received and the original command is missing,
   targets MUST consider the original command with that RefCmdSN to
   be received and issue a task management response with the response code
   "Task specified in the Referenced Task Tag field was not in task set".
   Any state referring to the aborted task at the initiator (if any) can
   then be discarded.  If the original command exists, then, as with any
   abort, the initiator expects a concluding status (that will not be
   delivered to SCSI) and the target MUST supply a status at abort time if
   it was not delivered earlier. The task management response is issued
   after the status.
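
The target-side behavior just described can be sketched as follows. The
class TargetTaskSet, the tuple representation of outgoing PDUs, and the
FUNCTION_COMPLETE placeholder are assumptions made for illustration; only
the "not in task set" response text comes from the paragraph above.

```python
NOT_IN_TASK_SET = "Task specified in the Referenced Task Tag field was not in task set"
FUNCTION_COMPLETE = "function complete"  # placeholder name for a success response


class TargetTaskSet:
    """Toy model of a target handling Abort Task under the rules above."""

    def __init__(self) -> None:
        self.tasks = {}               # task tag -> True if status already delivered
        self.received_cmdsns = set()  # CmdSNs the target considers received

    def handle_abort(self, ref_task_tag: int, ref_cmdsn: int) -> list:
        """Return the PDUs the target sends, in order, as (kind, detail) tuples."""
        if ref_task_tag not in self.tasks:
            # Original command is missing: consider it received and say so.
            self.received_cmdsns.add(ref_cmdsn)
            return [("task_mgmt_response", NOT_IN_TASK_SET)]
        out = []
        status_delivered = self.tasks.pop(ref_task_tag)
        if not status_delivered:
            # Supply the concluding status before the task management response.
            out.append(("scsi_status", ref_task_tag))
        out.append(("task_mgmt_response", FUNCTION_COMPLETE))
        return out
```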

   Julo


"Mallikarjun C." <cbm@rose.hp.com>
Sent by: owner-ips@ece.cmu.edu
13-11-01 21:21
Please respond to cbm

To:      ips <ips@ece.cmu.edu>
cc:
Subject: iSCSI: SCSI timeout handling change

All:

Currently, if a command is not acknowledged by the ULP
timeout, iSCSI requires initiators to tear down the session.
The rationale behind this is that if the initiator could not
get the command through (in possibly multiple retries) even
by the ULP timeout, there's a serious problem with the session.
But there are some drawbacks to this approach:

        - tearing down a session due to a NIC failure is
          disruptive to potentially several other active tasks
          on other NICs.
        - this puts those initiator implementations not wanting
          to do within-connection recovery (i.e., no retries) at
          a disadvantage, since one digest error would cause
          potentially several active I/Os to be terminated.
        - (albeit not very serious) this behavior is different
          from today's storage stacks' expectations of being
          able to selectively abort one I/O on a timeout (with
          no command retransmissions).

To address these issues, and also to simplify the current Task
Management request PDU, I propose the following changes to the
handling of SCSI timeouts:

Following changes to section 3.5:

- Abort Task MUST always be sent as an immediate command.

- Abort Task task management function request MUST be sent
  with its CmdSN equal to the CmdSN of the task to be aborted,
  and the Referenced Task Tag initialized to the ITT of the
  task to be aborted.

- As a consequence of the above, drop the RefCmdSN field in the
  Task Management command payload that is currently only
  used by the Abort Task function.
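
The three proposed changes amount to the following request construction.
This is a sketch only: AbortTaskRequest and build_abort_task are
hypothetical names, the request's own Initiator Task Tag is omitted, and
the field set is reduced to what the bullets above mention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AbortTaskRequest:
    immediate: bool           # always sent as an immediate command
    cmdsn: int                # equal to the CmdSN of the task being aborted
    referenced_task_tag: int  # ITT of the task being aborted
    # Note: no RefCmdSN field; the CmdSN above makes it redundant.


def build_abort_task(task_itt: int, task_cmdsn: int) -> AbortTaskRequest:
    """Build an Abort Task request for the task with the given ITT and CmdSN."""
    return AbortTaskRequest(immediate=True,
                            cmdsn=task_cmdsn,
                            referenced_task_tag=task_itt)
```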

Following changes to section 8.6:

I propose the following text to replace the current text:

An iSCSI initiator MAY attempt to plug a command sequence gap on
the target end (in the absence of an acknowledgement of the command
by way of ExpCmdSN) before the ULP timeout by retrying the
unacknowledged command, as described in section 8.1.

On a ULP timeout for a command that carried a CmdSN of n, if the
ExpCmdSN is still less than (n+1) on ULP timeout, the iSCSI initiator
MUST abort the command using the Abort Task task management function
request.  In this process, the target may see the abort request
before the original command itself for one of three reasons:
           - the original command was dropped due to digest error, or
           - the Abort Task request was shipped out-of-order
             on the same connection, or
           - the connection on which the original command was sent
             was successfully logged out.

If the abort request is received prior to the original command,
targets MUST consider the original command with that CmdSN to
be received and discard the original command if and when received,
i.e., treat it as a duplicate CmdSN.  For this reason, initiators
that want to maintain command ordering while keeping the same
session MUST NOT issue Abort Task on an unacknowledged command.
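
A target could implement the "treat it as a duplicate" rule roughly as
sketched here. The class AbortedCmdSNFilter and its methods are
hypothetical, and forgetting an entry once the late original has been
discarded is a simplification chosen for the sketch, not something the
proposal specifies.

```python
class AbortedCmdSNFilter:
    """Remembers CmdSNs aborted before the original command arrived."""

    def __init__(self) -> None:
        self.aborted_early = set()

    def on_abort_task(self, cmdsn: int, original_seen: bool) -> None:
        # Abort arrived first: consider the original command received.
        if not original_seen:
            self.aborted_early.add(cmdsn)

    def on_command(self, cmdsn: int) -> bool:
        """Return True if the command should be executed, False if discarded."""
        if cmdsn in self.aborted_early:
            self.aborted_early.discard(cmdsn)  # late original: drop as duplicate
            return False
        return True
```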

Following changes to section 2.2.2.1:

- The above approach exposes the possibility that some stale
  (aborted from target's perspective) commands could be stuck
  in the TCP connection long enough for the CmdSN to wrap, similar
  to the issue we dealt with for command retries.  So, aborting
  unacknowledged commands should require the same flushing
  actions described for command retries. [ I would almost
  prefer at this point to require flushing all connections
  every 2^31 - 1 commands starting from InitCmdSN, rather than
  enumerating these cases individually...]
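
The bracketed alternative, a periodic flush of all connections, could be
tracked with a simple counter. This is only a sketch of the idea:
FlushCounter is a hypothetical name, the count starts at session start
(that is, from InitCmdSN), and what "flushing" actually does is left out.

```python
FLUSH_INTERVAL = (1 << 31) - 1  # 2^31 - 1 commands, as suggested above


class FlushCounter:
    """Counts commands sent in a session and signals when a flush is due."""

    def __init__(self, interval: int = FLUSH_INTERVAL) -> None:
        self.interval = interval
        self.since_flush = 0

    def on_command_sent(self) -> bool:
        """Return True when all connections should be flushed."""
        self.since_flush += 1
        if self.since_flush >= self.interval:
            self.since_flush = 0
            return True
        return False
```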

Comments?
--
Mallikarjun


Mallikarjun Chadalapaka
Networked Storage Architecture
Network Storage Solutions Organization
MS 5668         Hewlett-Packard, Roseville.
cbm@rose.hp.com