Re: Comments on draft-satran-iscsi-01.txt

Mallikarjun and others,

I would like to voice my support for a logout mechanism. The draft advises us to use TCP FINs, but a standard mechanism to close channels/sessions would be preferable (generating TCP FINs is non-trivial in some programming environments).

In our experience, the logout mechanism has been very useful in Fibre Channel, and it would be a good resource deallocation/error recovery mechanism in iSCSI.

I suggested a logout mechanism for iSCSI in the early stages of the draft (Feb timeframe), but for some reason it was not accepted. Perhaps this is a good time for a discussion on the issue.

Prasenjit

Prasenjit Sarkar
Research Staff Member
IBM Almaden Research
San Jose

"Mallikarjun C." <cbm@rose.hp.com>@ece.cmu.edu on 07/27/2000 11:23:51 AM

Please respond to cbm@rose.hp.com

Sent by: owner-ips@ece.cmu.edu

To: ips@ece.cmu.edu
cc: randy_haagens@hp.com
Subject: Comments on draft-satran-iscsi-01.txt

All:

Please find enclosed my comments on the latest draft of iSCSI (draft-satran-iscsi-01.txt). The first section has comments and the second section has additional enhancements I am requesting.

Thanks.
--
Mallikarjun
cbm@rose.hp.com
Networked Storage Architecture
HP Storage Organization, Roseville.

Comments on draft-satran-iscsi-01.txt
-------------------------------------

o Section 2.1, page 4. Paraphrases SAM-2 by defining a task as "a linked set of SCSI commands". Suggest rewording as "a SCSI command, or possibly a linked set of SCSI commands". Wherever applicable, the iSCSI draft should also cite the corresponding SAM-2 clause numbers.

o Section 2.2.2 on page 5. First sentence states "iSCSI supports ordered command delivery within a session.". While I realize that ordered completion is not required by the iSCSI spec, I suggest rewording as "iSCSI supports ordered task initiation and completion within a session".
o I have a general question about all iSCSI PDUs from the initiator to the target carrying the CmdRN. Section 3.10.5 seems to leave it as an implementation choice, but why is it necessary? Once we have stipulated task allegiance to a TCP connection, and given that the SCSI Data PDU has the `Buffer Offset' field in it, why would the CmdRN be necessary for iSCSI Data PDUs from the initiator to the target?

o Section 2.2.2 on page 5. Towards the end of the second para, "The target will reject any command outside this range or...." should be reworded to make the intent obvious: "The target will ignore any command outside this range or...". We do not want network bandwidth to be consumed for something that cannot be valid.

o Section 2.2.2 on page 5. I suggest that the spec should spell out all the conditions under which the CmdRN is reset back to the initial value - target reset (currently forces session termination, but even if it does not in future), lun reset, and re-establishment of session.

o Section 2.2.2 on page 6, second para. There's a sentence - "iSCSI initiators are required to implement the numbering scheme if they support more than one connection." Suggest adding "per session" at the end of the sentence, as per current status of the spec. If and when session recovery mechanisms are defined, this statement may have to be modified to perhaps even mandate numbering with one TCP connection, and across sessions.

o Section 2.2.2 on page 6, third para - "iSCSI targets are not required to use the numbering scheme for ordered delivery". I take it that the iSCSI protocol layer which provides the service delivery port abstraction to the device server is not required to deliver commands in the CmdRN order. It appears, though, that the target iSCSI layer shall always keep track of the ExpCmdRN. This should be stated explicitly. Also, it appears that the StatRN may not be valid coming from targets which do not re-order received commands by CmdRN. It would help to state this as well.
o This capability of a target to enforce the numbering scheme in full is something the initiators would be better off knowing. I suggest adding a Login/Text key for this.

    EnforceOrdering: <yes | no> --> EnforceOrdering: <yes | no>

o Section 2.2.4 on page 7, first para. Contains a discussion about connection allegiance, which is limited to commands. I suggest tasks be used instead. I would also propose that the abort task management request have the same connection allegiance as the original command.

o Section 3.2.1 on page 14. Autosense is made optional through a bit setting in the SCSI Command PDU. Why can't it be made mandatory? That would make the life of the FC-iSCSI bridges a lot easier, since FC mandates it. Also, section 5.3 discusses Autosense from the device perspective, but stops short of saying that Autosense is mandatory for all iSCSI targets. Are there any device issues complicating the situation here? It appears that compliance with FC has been fairly in place.

o Section 3.3.4 on page 17. Suggest adding an additional iSCSI Status value in the SCSI Response PDU - "2 Non-existent iSCSI session". This shall be returned if a SCSI command is attempted without an established iSCSI login session. Returning a Logout PDU is another option (see the enhancement proposal below).

o Section 3.3.7 on page 17. It appears to be in need of more definition about Response data. I propose the following response data values -
  - RTT-related: the data sent did not match the burst size, or the offset allowed in the RTT message.
  - SCSI Command format: invalid command format.

o Section 3.3.7 on page 17. It appears that there's a violation of SAM-2 transport and application protocol layering here. The discussion allows certain iSCSI (transport) errors to be indicated when the Command Status (SCSI application protocol) is CHECK CONDITION. It would seem that all such transport errors should be able to be flagged with a non-zero iSCSI Status alone.
  The initiators would in that case be expected to look at the iSCSI status first, and proceed further only if it is zero.

o Sections 3.4 and 3.5 on pages 19 and 20. Suggest adding one sentence each to the NOP-OUT and NOP-IN sections to explicitly state the direction of flow of the PDU.

o Section 3.7.1 on page 23. For the Target Reset task management function, the target is not expected to provide a response - and this is concerning to me.
  - How would the initiator confirm the successful completion of the target reset?
  - Not having a response effectively makes the target reset an operation iSCSI hardware cannot assist. Having a response would make it no different from the rest of the iSCSI transactions, and the hardware can gracefully deal with it.
  - SAM-2 (section 6.6, page 63) specifies what a target should do "Before returning a FUNCTION COMPLETE response". This seems to me an implicit requirement that a response be returned on a Target Reset task management function.
  - I am also unclear as to why the sessions are allowed to be terminated. In the FC world, as far as I can recall, the process login sessions remain intact after a target reset. If it is required that the sessions and the associated TCP connections be cleared in iSCSI, it would be helpful to mandate (as opposed to leaving it up to the implementation) that an Async event shall be reported to all the initiators currently logged in.

o Section 3.8 on page 25. Suggest additional SCSI Task Management Response indications in addition to the two defined.

    2 Function Invalid (the function is invalid as per current rev)
    3 Function Unsupported (valid, but implementation doesn't support)

  Also suggest adding iSCSI Status and Response data fields to the PDU.
  - iSCSI Status can take three values: success, non-existent LUN, and non-existent iSCSI session.
  - Response data can take one value: Invalid message format.

o Section 3.10.2 on page 29.
  The Transfer Tag is the Initiator Task Tag in the SCSI Data PDU from target to initiator. Why then are both shown in the payload diagram? If this is done to retain some resemblance between the two types of SCSI Data PDUs, I would suggest that ideally there be one type of SCSI Data PDU for both READ and WRITE - with certain fields in the PDU to be ignored in each case. This makes it easier on hardware and software implementations.

o Section 3.11 on page 31. Suggest adding the statement "The capabilities exchanged and operations performed are valid for the entire login session including all TCP connections for that session." This makes it clear, for example, whether authentication should be performed on every new connection or only on the leading connection.

o Since Text key pairs are used both as part of the Login process and outside it, I advocate that they be named thus - "Text key pairs". This terminology can then be used consistently throughout. The current usage has "Text Command format" (in section 3.13), and "Login/Text keys" elsewhere (section 10).

o I suggest that the Login dialogue should also be able to identify the alternate names under which the same target (task manager and the device servers, albeit possibly with different target "id"s) is available. This would mean including more Text key pairs in Appendix B. A possible format is -

    OtherNames: <Descriptor type,Descriptor value>

  where the Descriptor type is as defined in section 3.17.1 and the Descriptor value is based on the given type.

o Section 3.14 on page 39. The first paragraph allows a target to respond to a Login Command with a Login Response and an unsolicited Text Response PDU. I suggest that there be a "Login reply" bit in the Text Response PDU to indicate to the initiator that the PDU is not a response to a Text Command, but a reply to a login proposal.

o Section 3.14.1 on page 39. States that the "InitStatRN is significant only if TSID is 0". I am somewhat confused by this. It appears that it should be non-zero.
  I am assuming that the TSID is non-zero in the first Login Response on a connection for a given session (leading connection), and is zero in all subsequent Login Responses on other connections of the same session. If this is true, only the leading connection's login dialogue should specify InitStatRN. [ I see now that Mike from IBM already pointed this out, but am keeping this to confirm the expected target behavior that I described. ]

o Section 3.18 on page 46. Suggest adding comments to explicitly state that the target should reject (with a non-zero Response) the Map Command when a map (or unmap) of a particular TAN (or SRA) fails out of a set of descriptors. Essentially, either all descriptors succeed, or all end up in a failure. [ I see that Bill Main also had a comment on the same issue. ]

o Section 4.1 on page 49. Second paragraph from the bottom. "if they are not acknowledged yet or a new CmdRN if they where acknowledged;" should be "if they are not acknowledged yet or a new CmdRN if they were acknowledged".

o Section 9.2 on page 59. The Write operation example depicts multiple SCSI Data PDUs being shipped to the target in response to one RTT PDU. This effectively puts a requirement on target implementations to keep the target transfer tag valid until the expected data size is received. It would be helpful to explicitly state this in section 3.9.

New proposals for enhancements
------------------------------

o I propose a new opcode to do Third Party Logout - to provide a service with (almost) the same name in Fibre Channel (Third Party Process Logout, TPRLO). This is issued by an initiator, and it requests the target to log out all third party initiators that are iSCSI logged in with the given target. This should be effective on all the initiators having iSCSI access to the same set of task manager and device servers.
  This feature allows one host (initiator) to ensure that there are no other hosts talking to a target device, in failover configurations. In order to ensure that this is not maliciously used by rogue hosts, an iSCSI target may selectively allow initiators to do TPRLO, with a TPRLO-password to be specified in the TPRLO command PDU and this password communicated in the login dialogue.

o I propose adding a set of two new iSCSI opcodes for "Logout" and "Logout Response".
  - Logout and Logout Response enable the dual roles of initiator and target to be independently played by SCSI devices.
  - These can also be used as an error recovery mechanism by a target, forcing the initiator to re-login.
  - These also enable multiple sessions to be operational across a pair of SCSI devices (if and when we want it).

  This Logout is _across_ all channels associated with the iSCSI Session, and a graceful connection termination using TCP FINs for all the individual TCP connections (other than the one Logout is delivered on) is recommended before this Logout command.

o Irrespective of the exact process to handle an error, I propose that an upper bound be specified on the time that a SCSI device (initiator or target) should wait before freeing up resources allocated for a SCSI task (and thus assume the implicit termination of the SCSI task). This timeout shall only be used in the case of a continued failure to re-establish the Login session for the said period. I know that this is a passionate topic, but even a ridiculously high upper bound (say 4 hours per task), varying by device class (disk, tape ..), is better than no upper bound. This is the only architected way for certain hosts (say, rebooted on average once a year) to recover task resources like Tags.
  Note that this is not an attempt to impose new timers on an iSCSI implementation. It only requires that the resources be set aside for this much time - one could envision an implementation which would reallocate resources only as needed with no timers, so the resources can be set aside far longer than the iSCSI spec requires.

o I propose that a new iSCSI PDU "SCSI Conf" (analogous to FCP_CONF of FC) be defined as a payload sent from an initiator to a target. This informs the target that the initiator received the SCSI Response message on that Initiator Task Tag. A target should wait to execute the subsequent commands on an error until this SCSI Confirmation PDU is received. This handshake "allows subsequent queued stateful operations to be performed" (taken out of the FCP-2 spec). In the context of tapes/asynch mirroring, this preserves ordering/coherency since the target device stalls for the SCSI Conf from the initiator. To avoid unnecessary overhead, a target would request the SCSI Conf message in the SCSI Response message only on a SCSI task ending in an error. Also, SCSI Conf should play by the rules of the connection allegiance.

o I suggest the following new Login/Text keys in section 10 -
  - Following are for capabilities.

      InitiatorCapability: <yes|no>
      TargetCapability: <yes|no>

  - Following for protocol revision.

      iSCSIRevision: <X.Y decimal>

    The responder to a Login command may choose to propose an equal or a lower rev than proposed in the Login command payload. If the counter-proposal in the response is not acceptable, the sender of the Login command should immediately log out. If the responder can only support higher revs, login is rejected.

  - Following, subsequent to the SCSI Conf enhancement request.

      SCSIConfSupport: <yes | no> --> SCSIConfSupport: <yes | no>
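As an illustration only, the iSCSIRevision negotiation rule proposed above could be sketched as follows. This is not taken from the draft. The function names (parse_rev, respond, initiator_accepts) are hypothetical, "X.Y decimal" is assumed to mean an integer major and minor number, and the responder is assumed to counter with the highest revision it supports that does not exceed the proposal (the proposal itself only says "equal or lower").

```python
from typing import Iterable, Optional, Tuple

# A revision is modeled as (major, minor); this reading of "X.Y decimal"
# is an assumption, not something the proposal spells out.
Rev = Tuple[int, int]


def parse_rev(text: str) -> Rev:
    """Parse an "X.Y" revision string into a comparable (major, minor) pair."""
    major, minor = text.split(".")
    return int(major), int(minor)


def respond(proposed: Rev, supported: Iterable[Rev]) -> Optional[Rev]:
    """Responder side: counter with an equal or lower rev, or reject.

    Returns None when the responder supports only revs higher than the
    proposed one, which the proposal treats as a rejected login.
    Picking the highest candidate is an assumption of this sketch.
    """
    candidates = [rev for rev in supported if rev <= proposed]
    return max(candidates) if candidates else None


def initiator_accepts(counter: Rev, acceptable: Iterable[Rev]) -> bool:
    """Initiator side: if this returns False, the initiator should log out."""
    return counter in set(acceptable)
```

For example, a responder supporting 1.0, 1.1 and 2.0 that receives a proposal of 1.2 would counter with 1.1 under these assumptions, and the initiator would then either continue or log out depending on whether 1.1 is acceptable to it.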