Re: Comments on draft-satran-iscsi-01.txt

To: cbm@rose.hp.com
Subject: Re: Comments on draft-satran-iscsi-01.txt
From: psarkar@almaden.ibm.com
Date: Thu, 27 Jul 2000 14:07:09 -0700
cc: ips@ece.cmu.edu, randy_haagens@hp.com
Content-Disposition: inline
Content-type: text/plain; charset=us-ascii
Sender: owner-ips@ece.cmu.edu



Mallikarjun and others,

I would like to voice my support for a logout mechanism.  The draft
advises us to use TCP FINs, but a standard mechanism to close
channels/sessions would be preferable (generating TCP FINs
is non-trivial in some programming environments).  In our experience,
the logout mechanism has been very useful in
Fibre Channel, and it would be a good resource deallocation/error recovery
mechanism in iSCSI.

I had suggested a logout mechanism in iSCSI at the early stages
of the draft (Feb timeframe), but for some reason it was not accepted.
Perhaps this is a good time for a discussion on the issue.

Prasenjit

   Prasenjit Sarkar
   Research Staff Member
   IBM Almaden Research
   San Jose


"Mallikarjun C." <cbm@rose.hp.com>@ece.cmu.edu on 07/27/2000 11:23:51 AM

Please respond to cbm@rose.hp.com

Sent by:  owner-ips@ece.cmu.edu


To:   ips@ece.cmu.edu
cc:   randy_haagens@hp.com
Subject:  Comments on draft-satran-iscsi-01.txt



All:

Please find enclosed my comments on the latest draft of iSCSI
(draft-satran-iscsi-01.txt).  The first section has comments
and the second section has additional enhancements I am requesting.

Thanks.
--
Mallikarjun
cbm@rose.hp.com
Networked Storage Architecture
HP Storage Organization, Roseville.





Comments on draft-satran-iscsi-01.txt
-------------------------------------

o Section 2.1, page 4. Paraphrases SAM-2 by defining a task as "a linked
  set of SCSI commands".  Suggest rewording as "a SCSI command, or possibly
  a linked set of SCSI commands".  Wherever applicable, the iSCSI draft
  should also cite the corresponding SAM-2 clause numbers.

o Section 2.2.2 on page 5. First sentence states "iSCSI supports ordered
  command delivery within a session."  While I realize that ordered
  completion is not required by the iSCSI spec, I suggest rewording as
  "iSCSI supports ordered task initiation and completion within a session".

o I have a general question about all iSCSI PDUs from the initiator to the
  target carrying the CmdRN.  Section 3.10.5 seems to leave it as an
  implementation choice, but why is it necessary?  Once we stipulated
  task allegiance to a TCP connection and given that the SCSI Data PDU
  has the `Buffer Offset' field in it, why would the CmdRN be necessary
  for iSCSI Data PDUs from the initiator to the target?

o Section 2.2.2 on page 5.  Towards the end of the second para, "The target
  will reject any command outside this range or...." should be reworded
  to make it obvious as "The target will ignore any command outside this
  range or...".   We do not want network bandwidth to be consumed for
  something that cannot be valid.

o Section 2.2.2 on page 5.  I suggest that the spec should spell out all
  the conditions under which the CmdRN is reset back to the initial value -
  target reset (currently forces session termination, but even if it doesn't
  in the future), LUN reset, and re-establishment of the session.

o Section 2.2.2 on page 6, second para. There's a sentence - "iSCSI
  initiators are required to implement the numbering scheme if they support
  more than one connection."  Suggest adding "per session" at the end of the
  sentence, as per current status of the spec.  If and when session recovery
  mechanisms are defined, this statement may have to be modified to perhaps
  even mandate numbering with one TCP connection, and across sessions.

o Section 2.2.2 on page 6, third para -  "iSCSI targets are not required to
  use the numbering scheme for ordered delivery".  I take it that the iSCSI
  protocol layer which provides the service delivery port abstraction to
  the device server is not required to deliver commands in the CmdRN order.
  It appears though as if the target iSCSI layer shall always keep track of
  the ExpCmdRN.  This should be stated explicitly.  Also, it appears
  that the StatRN may not be valid coming from the targets which do not
  re-order received commands by CmdRN.  It would help to state this as
  well.

o This capability of a target to enforce the numbering scheme in full is
  something the initiators would be better off knowing.  I suggest adding
  a Login/Text key for this.
            EnforceOrdering: <yes | no>
            --> EnforceOrdering: <yes | no>

o Section 2.2.4 on page 7, first para.  Contains a discussion about
  connection allegiance, which is limited to commands.  I suggest tasks be
  used instead.  I would also propose that the abort task management
  request have the same connection allegiance as the original
  command.

o Section 3.2.1 on page 14.  Autosense is made optional through a bit
  setting in the SCSI Command PDU.  Why can't it be made mandatory?
  That would make the life of the FC-iSCSI bridges a lot easier,
  since FC mandates it.  Also, section 5.3 discusses Autosense from the
  device perspective, but stops short of saying that Autosense is mandatory
  for all iSCSI targets.  Are there any device issues complicating the
  situation here?  It appears that compliance with FC has been fairly
  in place.

o Section 3.3.4 on page 17.  Suggest adding an additional iSCSI Status
  value in the SCSI Response PDU -  "2  Non-existent iSCSI session".
  This shall be returned if a SCSI command were attempted without an
  established iSCSI login session.  Returning a Logout PDU is another
  option (see the enhancement proposal below).

o Section 3.3.7 on page 17.  It appears to be in need of more definition
  about Response data.  I propose the following response data values -
      - RTT-related: the data sent did not match the burst size, or the
               offset allowed in the RTT message.

      - SCSI Command format: invalid command format.

o Section 3.3.7 on page 17.  It appears that there's a violation of
  SAM-2 transport and application protocol layering here.  The discussion
  allows certain iSCSI (transport) errors to be indicated when the
  Command Status (SCSI application protocol) is CHECK CONDITION.  It
  would seem that all such transport errors should be able to be flagged
  with a non-zero iSCSI Status alone.  The initiators would in that case
  be expected to look at the iSCSI Status first, and proceed further only
  if it is zero.
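
  The layering argued for above can be sketched in a few lines.  This is
  only an illustration, not anything defined by the draft: the function
  and enum names are hypothetical, and the only values taken as given are
  that a zero iSCSI Status means "no transport error" and that SCSI GOOD
  status is 0x00.

```python
from enum import Enum

ISCSI_STATUS_OK = 0   # zero iSCSI Status: no transport-level error
SCSI_GOOD = 0x00      # SCSI status code for GOOD


class Outcome(Enum):
    """Hypothetical classification used only for this sketch."""
    TRANSPORT_ERROR = "transport_error"   # reported by the iSCSI layer
    SCSI_STATUS = "scsi_status"           # non-GOOD SCSI status, e.g. CHECK CONDITION
    GOOD = "good"


def classify_response(iscsi_status: int, command_status: int) -> Outcome:
    """Look at the iSCSI Status first; consult Command Status only if it is zero."""
    if iscsi_status != ISCSI_STATUS_OK:
        # Transport error: the Command Status is not examined at all.
        return Outcome.TRANSPORT_ERROR
    if command_status != SCSI_GOOD:
        return Outcome.SCSI_STATUS
    return Outcome.GOOD
```

  With this split, a transport failure never has to be encoded as a
  CHECK CONDITION, which keeps the SAM-2 layers separate.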

o Sections 3.4 and 3.5 on pages 19 and 20.  Suggest adding one sentence
  each to NOP-OUT and NOP-IN sections to explicitly state the direction
  of flow of the PDU.

o Section 3.7.1 on page 23.  For the Target Reset task management
  function, the target is not expected to provide a response - and
  this is concerning to me.
       - how would the initiator confirm the successful completion of
         the target reset?

       - not having a response effectively makes the target reset
         an operation iSCSI hardware cannot assist.  Having a response
         would make it no different from the rest of the iSCSI transactions
         and the hardware can gracefully deal with it.

       - SAM-2 (section 6.6, page 63) specifies what a target should
         do "Before returning a FUNCTION COMPLETE response".  This seems
         to me to be an implicit requirement that a response be returned
         on a Target Reset task management function.

       - I am also unclear as to why the sessions are allowed to be
         terminated.  In the FC world, as far as I can recall, the
         process login sessions remain intact after a target reset.
         If it is required that the sessions and the associated TCP
         connections be cleared in iSCSI, it is helpful to mandate (as
         opposed to leaving it up to the implementation) that an Async
         event shall be reported to all the initiators currently logged in.

o Section 3.8 on page 25.  Suggest additional SCSI Task Management Response
  indications in addition to the two defined.
       2   Function Invalid (the function is invalid as per current rev)
       3   Function Unsupported (valid, but implementation doesn't support)

  Also suggest adding iSCSI Status and Response data fields to the PDU.
     - iSCSI Status can take three values: success, non-existent
       LUN, and non-existent iSCSI session.
     - Response data can take one value: Invalid message format.

o Section 3.10.2 on page 29.  The Transfer Tag is the Initiator Task
  Tag in SCSI Data PDU from target to initiator.  Why then are both
  shown in the payload diagram?  If this is done to retain some resemblance
  between the two types of SCSI Data PDUs, I would suggest that ideally
  there be one type of SCSI Data PDU for both READ and WRITE - with
  certain fields in the PDU to be ignored in each case.  This makes it
  easier on hardware and software implementations.

o Section 3.11 on page 31.  Suggest adding statement "The capabilities
  exchanged and operations performed are valid for the entire login
  session including all TCP connections for that session."  This makes
  it clear, for example, whether authentication should be performed
  on every new connection or only on the leading connection.

o Since Text key pairs are used as part of Login process and outside,
  I advocate that they be named thus - "Text key pairs".  This terminology
  can then be used consistently throughout.  The current usage has "Text Command
  format" (in section 3.13), and "Login/Text keys" elsewhere (section 10).

o I suggest that Login dialogue should also be able to identify the
  alternate names the same target (task manager and the device servers,
  albeit possibly with different target "id"s) is available from.
  This would mean including more Text key pairs in Appendix B.
  A possible format is -
            OtherNames: <Descriptor type,Descriptor value>

            where the Descriptor type is as defined in section 3.17.1
            and the Descriptor value is based on the given type.

o Section 3.14 on page 39.  The first paragraph allows a target to
  respond to a Login Command with a Login Response and an unsolicited
  Text Response PDU.  I suggest that there be a "Login reply" bit in
  the Text Response PDU to indicate to the initiator that the PDU is
  not in response to a Text Command, but as a reply to a login proposal.

o Section 3.14.1 on page 39.  States that the "InitStatRN is significant
  only if TSID is 0".  I am somewhat confused by this.  It appears that
  it should be non-zero.  I am assuming that the TSID is non-zero in the
  first Login Response on a connection for a given session (leading
  connection), and is zero in all subsequent Login Responses on other
  connections of the same session.  If this were true, only the leading
  connection's login dialogue should specify InitStatRN.
  [ I see now that Mike from IBM already pointed this out, but am keeping
    this to confirm the expected target behavior that I described. ]

o Section 3.18 on page 46.  Suggest adding comments to explicitly state
  that target should reject (with a non-zero Response) the Map Command
  when a map (or unmap) of a particular TAN (or SRA) fails out of a set
  of descriptors.  Essentially, either all descriptors succeed, or all
  end up in a failure.  [ I see that Bill Main also had a comment on the
  same issue. ]
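
  The all-or-nothing behavior requested for the Map Command above could
  be sketched as follows.  Everything here is an assumption made for
  illustration: the descriptor layout (operation, TAN, SRA), the failure
  conditions, and the Response value 1 are hypothetical and do not come
  from the draft.  The point is only that changes are staged on a copy
  and committed when every descriptor has succeeded.

```python
class MapError(Exception):
    """Raised when a single descriptor cannot be applied."""


RESPONSE_OK = 0
RESPONSE_REJECTED = 1  # hypothetical non-zero Response value


def apply_map_command(table, descriptors):
    """Apply every (op, tan, sra) descriptor, or none of them.

    `table` maps TAN -> SRA and is modified only if all descriptors succeed.
    Returns the Response value for the Map Command.
    """
    staged = dict(table)  # work on a copy so a failure leaves `table` untouched
    try:
        for op, tan, sra in descriptors:
            if op == "map":
                if tan in staged:
                    raise MapError(f"TAN {tan} is already mapped")
                staged[tan] = sra
            elif op == "unmap":
                if tan not in staged:
                    raise MapError(f"TAN {tan} is not mapped")
                del staged[tan]
            else:
                raise MapError(f"unknown operation {op!r}")
    except MapError:
        return RESPONSE_REJECTED
    # Commit: every descriptor succeeded, so make the staged state visible.
    table.clear()
    table.update(staged)
    return RESPONSE_OK
```

  A Map Command whose second descriptor fails therefore leaves the first
  descriptor unapplied as well, and the initiator sees one non-zero
  Response for the whole command.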

o Section 4.1 on page 49.  Second paragraph from the bottom. "if they
  are not acknowledged yet or a new CmdRN if they where acknowledged;"
  should be "if they are not acknowledged yet or a new CmdRN if they
  were acknowledged".

o Section 9.2 on page 59.  The Write operation example depicts multiple
  SCSI Data PDUs being shipped to the target in response to one RTT PDU.
  This effectively puts a requirement on the target implementations to
  keep the target transfer tag valid until the expected data size is
  received.  It would be helpful to explicitly state this in section 3.9.


New proposals for enhancements
------------------------------

o I propose a new opcode to do Third Party Logout - to provide a service
  with (almost) the same name in Fibre Channel (Third Party Process Logout,
  TPRLO).  This is issued by an initiator, and it requests the target to
  log out all third party initiators who are iSCSI logged in with the
  given target.  This should be effective on all the initiators having iSCSI
  access to the same set of task manager and device servers.  This feature
  allows one host (initiator) to ensure that there are no other hosts
  talking to a target device, in failover configurations.  In order to
  ensure that this is not maliciously used by rogue hosts, an iSCSI target
  may selectively allow initiators to do TPRLO by requiring a TPRLO-password
  in the TPRLO command PDU, with this password communicated in the
  login dialogue.

o I propose adding a set of two new iSCSI opcodes for "Logout" and "Logout
  Response".
            - Logout and Logout Response enable the dual roles of initiator
              and target to be independently played by SCSI devices.
            - these can also be used as an error recovery mechanism by a
              target, forcing the initiator to re-login.
            - these also enable multiple sessions to be operational across
              a pair of SCSI devices (if and when we want it).

  This Logout is _across_ all channels associated with the iSCSI Session,
  and a graceful connection termination using TCP FINs for all the
  individual TCP connections (other than the one Logout is delivered on)
  is recommended before this Logout command.

o Irrespective of the exact process to handle an error, I propose that
  an upper bound be specified on the time that a SCSI device (initiator
  or target) should wait before freeing up resources allocated for a SCSI
  task (and thus assume the implicit termination of the SCSI task).  This
  timeout shall only be used in the case of a continued failure to
  re-establish the Login session for the said period.  I know that this is
  a passionate topic, but even a ridiculously high upper bound (say 4 hours
  per task), varying by device class (disk, tape ..), is better than no
  upper bound.  This is the only architected way for certain hosts (say,
  rebooted on average once a year) to recover task resources like Tags.
  Note that this is not an attempt to impose new timers on an iSCSI
  implementation; it only requires that the resources be set aside for
  this much time.  One could envision an implementation which would
  reallocate resources only as needed with no timers, so the resources can
  be set aside far longer than the iSCSI spec requires.
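
  The "no timers" style of implementation mentioned above might look
  roughly like this.  It is a minimal sketch under stated assumptions:
  the class and method names are hypothetical, the 4-hour figure is just
  the example bound from the text, and the clock is injectable only so
  the behavior can be exercised without waiting.

```python
import time


class TaskResources:
    """Tracks task tags orphaned by a lost session and frees them lazily."""

    def __init__(self, max_hold_seconds, clock=time.monotonic):
        self._max_hold = max_hold_seconds  # upper bound the spec would define
        self._clock = clock
        self._orphaned = {}                # tag -> time the session was lost

    def session_lost(self, tags):
        """Record when each tag's session went away; no timer is started."""
        now = self._clock()
        for tag in tags:
            self._orphaned.setdefault(tag, now)

    def session_restored(self, tags):
        """Login was re-established in time; the tags are live again."""
        for tag in tags:
            self._orphaned.pop(tag, None)

    def reclaim(self):
        """Called only when resources are needed; frees tags past the bound."""
        now = self._clock()
        expired = [tag for tag, since in self._orphaned.items()
                   if now - since >= self._max_hold]
        for tag in expired:
            del self._orphaned[tag]
        return sorted(expired)
```

  Because reclaim() runs only on demand, tags may stay reserved far
  longer than the bound, which is exactly what the proposal permits; the
  bound only says when freeing them becomes legitimate.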

o I propose that a new iSCSI PDU "SCSI Conf" (analogous to FCP_CONF of FC)
  be defined as a payload sent from an initiator to a target.  This informs
  the target that the initiator received the SCSI Response message on that
  Initiator Task Tag.  A target should wait to execute the subsequent
  commands on an error until this SCSI Confirmation PDU is received.  This
  handshake "allows subsequent queued stateful operations to be performed"
  (taken out of FCP-2 spec).  In the context of tapes/asynch mirroring, this
  preserves ordering/coherency since the target device stalls for the SCSI
  Conf from the initiator.  To avoid unnecessary overhead, a target would
  request the SCSI Conf message in the SCSI Response message only on a SCSI
  task ending in an error.  SCSI Conf should also play by the rules of
  connection allegiance.

o I suggest the following new Login/Text keys in section 10 -

         - Following are for capabilities.
             InitiatorCapability: <yes|no>
             TargetCapability: <yes|no>

         - Following for protocol revision.
             iSCSIRevision: <X.Y decimal>

           The responder to a Login command may choose to propose an
           equal or a lower rev than proposed in the Login command payload.
           If the counter-proposal in the response is not acceptable,
           the sender of Login command should immediately log out.  If the
           responder can only support higher revs, login is rejected.

         - Following, subsequent to the SCSI Conf enhancement request.
             SCSIConfSupport: <yes | no>
             --> SCSIConfSupport: <yes | no>
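
  The iSCSIRevision counter-proposal rule described above can be
  sketched as below.  The function names are hypothetical, and reading
  "X.Y decimal" as a (major, minor) pair of integers is an assumption of
  this sketch, not something the key definition states.

```python
def parse_rev(text):
    """Parse an 'X.Y' revision into a comparable (major, minor) tuple."""
    major, minor = text.split(".")
    return (int(major), int(minor))


def counter_propose(proposed, supported):
    """Return the revision the responder offers, or None to reject login.

    The responder may answer with an equal or lower revision than the one
    proposed; if it supports only higher revisions, login is rejected.
    """
    limit = parse_rev(proposed)
    candidates = [rev for rev in supported if parse_rev(rev) <= limit]
    if not candidates:
        return None  # responder can only support higher revs
    return max(candidates, key=parse_rev)
```

  For example, a responder supporting 1.0, 1.1 and 1.3 would answer a
  proposal of 1.2 with 1.1, and the sender of the Login command then
  decides whether to accept that or log out.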