Re: Comments on draft-satran-iscsi-01.txt
Mallikarjun and others,
I would like to voice my support for a logout mechanism. The draft
advises us to use TCP FINs, but a standard mechanism to close
channels/sessions would be preferable (generating TCP FINs
is non-trivial in some programming environments). In our experience,
the logout mechanism has been very useful in
Fibre Channel, and it would be a good resource deallocation/error recovery
mechanism in iSCSI.
I had suggested a logout mechanism for iSCSI at the early stages
of the draft (Feb timeframe), but for some reason it was not accepted.
Perhaps this is a good time for a discussion on the issue.
Prasenjit
Prasenjit Sarkar
Research Staff Member
IBM Almaden Research
San Jose
"Mallikarjun C." <cbm@rose.hp.com>@ece.cmu.edu on 07/27/2000 11:23:51 AM
Please respond to cbm@rose.hp.com
Sent by: owner-ips@ece.cmu.edu
To: ips@ece.cmu.edu
cc: randy_haagens@hp.com
Subject: Comments on draft-satran-iscsi-01.txt
All:
Please find enclosed my comments on the latest draft of iSCSI
(draft-satran-iscsi-01.txt). The first section has comments
and the second section has additional enhancements I am requesting.
Thanks.
--
Mallikarjun
cbm@rose.hp.com
Networked Storage Architecture
HP Storage Organization, Roseville.
Comments on draft-satran-iscsi-01.txt
-------------------------------------
o Section 2.1, page 4. Paraphrases SAM-2 by defining a task as "a linked
set of SCSI commands". Suggest rewording as "a SCSI command, or possibly
a linked set of SCSI commands". Wherever applicable, the iSCSI draft
should also cite the corresponding SAM-2 clause numbers.
o Section 2.2.2 on page 5. First sentence states "iSCSI supports ordered
command delivery within a session.". While I realize that ordered
completion is not required by the iSCSI spec, I suggest rewording as
"iSCSI supports ordered task initiation and completion within a session".
o I have a general question about all iSCSI PDUs from the initiator to the
target carrying the CmdRN. Section 3.10.5 seems to leave it as an
implementation choice, but why is it necessary? Once we stipulated
task allegiance to a TCP connection and given that the SCSI Data PDU
has the `Buffer Offset' field in it, why would the CmdRN be necessary
for iSCSI Data PDUs from the initiator to the target?
o Section 2.2.2 on page 5. Towards the end of the second para, "The target
will reject any command outside this range or...." should be reworded
to make it obvious as "The target will ignore any command outside this
range or...". We do not want network bandwidth to be consumed for
something that cannot be valid.
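As an illustration of the "ignore" behavior suggested above, here is a minimal Python sketch (not taken from the draft). It assumes a 32-bit CmdRN that wraps around and a window bounded by ExpCmdRN and a hypothetical upper limit called max_cmd_rn here; the function names are made up for the example.

```python
MOD = 1 << 32  # assumption: CmdRN is a 32-bit counter that wraps around


def in_window(cmd_rn: int, exp_cmd_rn: int, max_cmd_rn: int) -> bool:
    """True if cmd_rn lies in [exp_cmd_rn, max_cmd_rn], modulo 2**32."""
    return (cmd_rn - exp_cmd_rn) % MOD <= (max_cmd_rn - exp_cmd_rn) % MOD


def on_command(cmd_rn: int, exp_cmd_rn: int, max_cmd_rn: int) -> str:
    """Return 'accept' or 'ignore'; 'ignore' means nothing is sent back."""
    if in_window(cmd_rn, exp_cmd_rn, max_cmd_rn):
        return "accept"
    return "ignore"
```

With "ignore", an out-of-window command costs the target one comparison and no reply PDU, which is the bandwidth saving the comment is after.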
o Section 2.2.2 on page 5. I suggest that the spec should spell out all
the conditions under which the CmdRN is reset back to the initial value -
target reset (currently forces session termination, but even if it doesn't
in the future), LUN reset, and re-establishment of session.
o Section 2.2.2 on page 6, second para. There's a sentence - "iSCSI
initiators are required to implement the numbering scheme if they support
more than one connection." Suggest adding "per session" at the end of the
sentence, as per current status of the spec. If and when session recovery
mechanisms are defined, this statement may have to be modified to perhaps
even mandate numbering with one TCP connection, and across sessions.
o Section 2.2.2 on page 6, third para - "iSCSI targets are not required to
use the numbering scheme for ordered delivery". I take it that the iSCSI
protocol layer which provides the service delivery port abstraction to
the device server is not required to deliver commands in the CmdRN order.
It appears though as if the target iSCSI layer shall always keep track of
the ExpCmdRN. This should be stated explicitly. Also, it appears
that the StatRN may not be valid coming from the targets which do not
re-order received commands by CmdRN. It would help to state this as
well.
o This capability of a target to enforce the numbering scheme in full is
something the initiators would be better off knowing. I suggest adding
a Login/Text key for this.
EnforceOrdering: <yes | no>
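A rough Python sketch of how an initiator might read such a key is given below. It assumes, purely for illustration, that text keys arrive as one "Key: value" pair per line and that a missing EnforceOrdering key means "no"; neither assumption comes from the draft, and the function names are hypothetical.

```python
def parse_text_keys(payload: str) -> dict:
    """Split a text payload into {key: value}, one 'Key: value' per line."""
    keys = {}
    for line in payload.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed text key pair: {line!r}")
        keys[name.strip()] = value.strip()
    return keys


def target_enforces_ordering(keys: dict) -> bool:
    """Interpret the proposed EnforceOrdering key (assumed default: no)."""
    value = keys.get("EnforceOrdering", "no").lower()
    if value not in ("yes", "no"):
        raise ValueError(f"EnforceOrdering must be yes or no, got {value!r}")
    return value == "yes"
```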
o Section 2.2.4 on page 7, first para. Contains a discussion about
connection allegiance, which it limits to commands. I suggest tasks be
used instead. I would also propose that the abort task management
request have the same connection allegiance as the original
command.
o Section 3.2.1 on page 14. Autosense is made optional through a bit
setting in the SCSI Command PDU. Why can't it be made mandatory?
That would make life a lot easier for FC-iSCSI bridges,
since FC mandates it. Also, section 5.3 discusses Autosense from the
device perspective, but stops short of saying that Autosense is mandatory
for all iSCSI targets. Are there any device issues complicating the
situation here? It appears that compliance with the FC requirement is
already fairly well in place.
o Section 3.3.4 on page 17. Suggest adding an additional iSCSI Status
value in the SCSI Response PDU - "2 Non-existent iSCSI session".
This shall be returned if a SCSI command is attempted without an
established iSCSI login session. Returning a Logout PDU is another
option (see the enhancement proposal below).
o Section 3.3.7 on page 17. It appears to be in need of more definition
about Response data. I propose the following response data values -
- RTT-related: the data sent did not match the burst size, or the
offset allowed in the RTT message.
- SCSI Command format: invalid command format.
o Section 3.3.7 on page 17. It appears that there's a violation of
SAM-2 transport and application protocol layering here. The discussion
allows certain iSCSI (transport) errors to be indicated when the
Command Status (SCSI application protocol) is CHECK CONDITION. It
would seem that all such transport errors should be able to be flagged
with a non-zero iSCSI Status alone. The initiators would in that case
be expected to look at the iSCSI status first, and proceed further only
if it is zero.
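The layering proposed above can be sketched in a few lines of Python. This is only an illustration: the function name and the returned labels are hypothetical, while the GOOD (0x00) and CHECK CONDITION (0x02) codes are the SAM-2 status values.

```python
ISCSI_STATUS_OK = 0          # zero iSCSI Status: no transport-level error
SCSI_GOOD = 0x00             # SAM-2 status code GOOD
SCSI_CHECK_CONDITION = 0x02  # SAM-2 status code CHECK CONDITION


def classify_response(iscsi_status: int, command_status: int) -> str:
    """Check the transport (iSCSI) status first, then the SCSI status."""
    if iscsi_status != ISCSI_STATUS_OK:
        # Transport error: the Command Status is not consulted at all.
        return "transport-error"
    if command_status == SCSI_GOOD:
        return "good"
    if command_status == SCSI_CHECK_CONDITION:
        return "check-condition"
    return "other-scsi-status"
```

The point of the ordering is that a transport error never has to be smuggled through CHECK CONDITION, so the two layers stay separate.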
o Sections 3.4 and 3.5 on pages 19 and 20. Suggest adding one sentence
each to NOP-OUT and NOP-IN sections to explicitly state the direction
of flow of the PDU.
o Section 3.7.1 on page 23. For the Target Reset task management
function, the target is not expected to provide a response - and
this is concerning to me.
- how would the initiator confirm the successful completion of
the target reset?
- not having a response effectively makes the target reset
an operation iSCSI hardware cannot assist. Having a response
would make it no different from the rest of the iSCSI transactions
and the hardware can gracefully deal with it.
- SAM-2 (section 6.6, page 63) specifies what a target should
do "Before returning a FUNCTION COMPLETE response". This seems
to me as an implicit requirement that a response be returned
on a Target Reset task management function.
- I am also unclear as to why the sessions are allowed to be
terminated. In the FC world, as far as I can recall, the
process login sessions remain intact after a target reset.
If it is required that the sessions and the associated TCP
connections be cleared in iSCSI, it is helpful to mandate (as
opposed to leaving it up to the implementation) that an Async
event shall be reported to all the initiators currently logged in.
o Section 3.8 on page 25. Suggest additional SCSI Task Management Response
indications in addition to the two defined.
2 Function Invalid (the function is invalid as per current rev)
3 Function Unsupported (valid, but implementation doesn't support)
Also suggest adding iSCSI Status and Response data fields to the PDU.
- iSCSI Status can take three values: success, non-existent
LUN, and non-existent iSCSI session.
- Response data can take one value: Invalid message format.
o Section 3.10.2 on page 29. The Transfer Tag is the Initiator Task
Tag in SCSI Data PDU from target to initiator. Why then are both
shown in the payload diagram? If this is done to retain some resemblance
between the two types of SCSI Data PDUs, I would suggest that ideally
there be one type of SCSI Data PDU for both READ and WRITE - with
certain fields in the PDU to be ignored in each case. This makes it
easier on hardware and software implementations.
o Section 3.11 on page 31. Suggest adding statement "The capabilities
exchanged and operations performed are valid for the entire login
session including all TCP connections for that session." This settles
questions such as whether authentication should be performed
on every new connection, or only on the leading connection.
o Since Text key pairs are used as part of Login process and outside,
I advocate that they be named thus - "Text key pairs". This terminology
can then be used consistently throughout. The current usage has "Text Command
format" (in section 3.13), and "Login/Text keys" elsewhere (section 10).
o I suggest that the Login dialogue should also be able to identify the
alternate names under which the same target (task manager and the device
servers, albeit possibly with different target "id"s) is available.
This would mean including more Text key pairs in Appendix B.
A possible format is -
OtherNames: <Descriptor type,Descriptor value>
where the Descriptor type is as defined in section 3.17.1
and the Descriptor value is based on the given type.
o Section 3.14 on page 39. The first paragraph allows a target to
respond to a Login Command with a Login Response and an unsolicited
Text Response PDU. I suggest that there be a "Login reply" bit in
the Text Response PDU to indicate to the initiator that the PDU is
not in response to a Text Command, but as a reply to a login proposal.
o Section 3.14.1 on page 39. States that the "InitStatRN is significant
only if TSID is 0". I am somewhat confused by this. It appears that
it should be non-zero. I am assuming that the TSID is non-zero in the
first Login Response on a connection for a given session (leading
connection), and is zero in all subsequent Login Responses on other
connections of
the same session. If this were true, only the leading connection's login
dialogue should specify InitStatRN.
[ I see now that Mike from IBM already pointed this out, but am keeping
this to confirm the expected target behavior that I described. ]
o Section 3.18 on page 46. Suggest adding comments to explicitly state
that target should reject (with a non-zero Response) the Map Command
when a map (or unmap) of a particular TAN (or SRA) fails out of a set
of descriptors. Essentially, either all descriptors succeed, or all
end up in a failure. [ I see that Bill Main also had a comment on the
same issue. ]
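The all-or-nothing behavior requested above can be illustrated with a small Python sketch. Everything here is hypothetical: the mapping table is modeled as a plain dict from TAN to SRA (both treated as opaque values), unmap is left out, and validate stands in for whatever per-descriptor check a target would really perform.

```python
def apply_map_command(table: dict, descriptors, validate) -> int:
    """Apply every (tan, sra) descriptor or none; return the Response code.

    Returns 0 on success. On the first failing descriptor it returns a
    non-zero Response and leaves `table` exactly as it was.
    """
    staged = dict(table)  # work on a copy so a failure changes nothing
    for tan, sra in descriptors:
        if not validate(tan, sra):
            return 1  # reject the whole Map Command
        staged[tan] = sra
    table.clear()
    table.update(staged)  # commit only after every descriptor passed
    return 0
```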
o Section 4.1 on page 49. Second paragraph from the bottom. "if they
are not acknowledged yet or a new CmdRN if they where acknowledged;"
should be "if they are not acknowledged yet or a new CmdRN if they
were acknowledged".
o Section 9.2 on page 59. The Write operation example depicts multiple
SCSI Data PDUs being shipped to the target in response to one RTT PDU.
This effectively puts a requirement on the target implementations to
keep the target transfer tag valid until the expected data size is
received. It would be helpful to explicitly state this in section 3.9.
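To show what "keeping the target transfer tag valid" implies for a target, here is a minimal Python sketch. It is an assumption-laden illustration, not the draft's mechanism: the class and method names are invented, and it simply counts bytes, assuming data PDUs for a tag never overlap or repeat.

```python
class TargetTransferTags:
    """Tracks how much data is still expected for each outstanding RTT."""

    def __init__(self):
        self.remaining = {}  # target transfer tag -> bytes still expected

    def issue_rtt(self, tag: int, expected_length: int) -> None:
        self.remaining[tag] = expected_length

    def on_data_pdu(self, tag: int, length: int) -> bool:
        """Account for one SCSI Data PDU; return True when the RTT is done."""
        if tag not in self.remaining:
            raise KeyError(f"unknown or already released transfer tag {tag}")
        left = self.remaining[tag] - length
        if left < 0:
            raise ValueError("more data received than the RTT allowed")
        if left == 0:
            del self.remaining[tag]  # tag is released only now
            return True
        self.remaining[tag] = left
        return False
```

The tag stays in the table across several data PDUs and is released only when the expected size has arrived.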
New proposals for enhancements
------------------------------
o I propose a new opcode to do Third party Logout - to provide a service
with (almost) the same name in Fibre Channel (Third Party Process Logout,
TPRLO). This is issued by an initiator, and it requests the target to
log out all third party initiators that are iSCSI logged in with the
given target. This should be effective on all the initiators having
iSCSI access to the same set of task manager and device servers. This
feature allows one host (initiator) to ensure that there are no other hosts
talking to a target device, in failover configurations. In order to
ensure that this is not maliciously used by rogue hosts, an iSCSI target may
selectively allow initiators to do TPRLO by requiring a TPRLO-password to be
specified in the TPRLO command PDU, with this password communicated in the
login dialogue.
o I propose adding a set of two new iSCSI opcodes for "Logout" and "Logout
Response".
- Logout and Logout Response enable the dual roles of initiator
and target to be independently played by SCSI devices.
- these can also be used as an error recovery mechanism by a
target, forcing the initiator to re-login.
- these also enable multiple sessions to be operational across
a pair of SCSI devices (if and when we want it).
This Logout is _across_ all channels associated with the iSCSI Session,
and a graceful connection termination using TCP FINs for all the
individual TCP connections (other than the one Logout is delivered on) is
recommended before this Logout command.
o Irrespective of the exact process to handle an error, I propose that
an upper bound be specified on the time that a SCSI device (initiator
or target) should wait before freeing up resources allocated for a SCSI
task (and thus assume the implicit termination of the SCSI task). This
timeout shall only be used in the case of a continued failure to
re-establish the Login session for the said period. I know that this is
a passionate topic, but even a ridiculously high upper bound (say 4 hours
per task), varying by device class (disk, tape ..) is better than no upper
bound. This is the only architected way for certain hosts (say rebooted on
average once a year) to recover task resources like Tags. Note that
this is not an attempt to impose new timers on an iSCSI implementation;
it only requires that the resources be set aside for this much
time - one could envision an implementation which would reallocate resources
only as needed with no timers, so the resources can be set aside far
longer than the iSCSI spec requires.
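The "no timers" implementation envisioned above might look like the following Python sketch. It is an illustration under stated assumptions: the class and its fields are hypothetical, time is passed in explicitly, and session re-establishment (which would return orphaned tasks to service) is not modeled.

```python
class TaskResources:
    """Task-tag table that reclaims orphaned entries lazily, with no timers."""

    def __init__(self, capacity: int, min_hold_seconds: float):
        self.capacity = capacity
        self.min_hold = min_hold_seconds  # the bound the spec would define
        self.active = set()               # tags of tasks on a live session
        self.orphaned = {}                # tag -> time the session was lost

    def session_lost(self, now: float) -> None:
        """Mark every active task as orphaned at time `now`."""
        for tag in self.active:
            self.orphaned[tag] = now
        self.active.clear()

    def allocate(self, tag: int, now: float) -> bool:
        """Allocate a tag, reclaiming expired orphans only if space is short."""
        if len(self.active) + len(self.orphaned) >= self.capacity:
            expired = [t for t, lost in self.orphaned.items()
                       if now - lost >= self.min_hold]
            for t in expired:
                del self.orphaned[t]
        if len(self.active) + len(self.orphaned) >= self.capacity:
            return False  # nothing may be reclaimed yet
        self.active.add(tag)
        return True
```

Resources are never freed before the bound has elapsed, and may be held much longer if nobody needs them.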
o I propose that a new iSCSI PDU "SCSI Conf" (analogous to FCP_CONF of FC)
be defined as a payload sent from an initiator to a target. This informs
the target that the initiator received the SCSI Response message on that
Initiator Task Tag. A target should wait to execute the subsequent
commands on an error until this SCSI Confirmation PDU is received. This
handshake "allows subsequent queued stateful operations to be performed"
(taken out of FCP-2 spec). In the context of tapes/asynch mirroring, this
preserves ordering/coherency since the target device stalls for the SCSI
Conf from the initiator. To avoid unnecessary overhead, a target would
request the SCSI Conf message in the SCSI Response message only on a SCSI
task ending in an error. SCSI Conf should also play by the rules of
connection allegiance.
o I suggest the following new Login/Text keys in section 10 -
- Following are for capabilities.
InitiatorCapability: <yes|no>
TargetCapability: <yes|no>
- Following for protocol revision.
iSCSIRevision: <X.Y decimal>
The responder to a Login command may choose to propose an
equal or a lower rev than proposed in the Login command payload.
If the counter-proposal in the response is not acceptable,
the sender of Login command should immediately log out. If the
responder can only support higher revs, login is rejected.
- Following, subsequent to the SCSI Conf enhancement request.
SCSIConfSupport: <yes | no>
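The revision negotiation described in the list above can be sketched in Python as follows. The helper names are hypothetical, and choosing the highest supported revision that does not exceed the proposal is an assumption; the text only says the responder may propose an equal or lower rev.

```python
def parse_revision(text: str) -> tuple:
    """Turn an 'X.Y' decimal revision string into a comparable (X, Y) tuple."""
    major, minor = text.split(".")
    return (int(major), int(minor))


def respond_to_revision(proposed: tuple, supported: list):
    """Responder side: counter-propose a rev <= proposed, or None to reject.

    None models 'login is rejected' when only higher revs are supported.
    """
    candidates = [rev for rev in supported if rev <= proposed]
    return max(candidates) if candidates else None


def initiator_accepts(counter, acceptable: list) -> bool:
    """Login sender side: False means it should log out immediately."""
    return counter is not None and counter in acceptable
```

Parsing into integer tuples avoids comparing "1.10" and "1.9" as strings or floats, which would order them incorrectly.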