Comments to Comments!

Jack,

Thanks for your attention and detailed comments. I sincerely hope that we
can all work together to get to a better standard. Here are our thoughts,
as expressed by several authors:

--- start forwarded message by harwood, jack ---
> From: "harwood, jack" <harwood_jack@emc.com>
> To: ips@ece.cmu.edu
> Subject: Comments on the current iSCSI draft
> Date: Fri, 10 Mar 2000 20:26:05 -0500
...
> Architectural
> -------------
> * There is an issue with the separation of the Control and Data Channel. NAT
> (address translation), firewall, or load balancing products will not support
> iSCSI without changes which in turn is a barrier to adoption for large
> networks. If the goal is to provide interleaving of control commands with
> large data transfers we feel this can be accomplished in other ways.
> - Use smaller data frames to allow better interleaving of control and
> data on a single connection
> - Use multiple connections between the same source and destination pair
> where each connection is independent of other connections
> (i.e., data/control are combined on each connection).
> Separation of control and data also adds new failure modes where one channel
> closes but the other does not.

True, separating data from control introduces some new problems that could
be avoided if we interleave. We briefly considered such a design, aiming at
one TCP connection per LUN, but this is inordinately expensive. If we
multiplex LUNs (as we do in the current draft), keeping to a short TCP frame
will leave us open to all sorts of troubles (possible deadlocks) due to the
limited TCP window and our lack of control over the data source and sink.
With separate control and data streams we can resort to selective resets to
get out of trouble, while with a common connection we might have to resort
to radical means (e.g., closing connections).
In addition, in a "permissive" environment (like a video server) we might
require CRC on the control connection while leaving the data connections up
to the user. It is a bit more difficult to implement but worth the trouble.

> * The use of DNS addressing in the protocol as described in sections 3.13,
> Open Data Connection, and section 3.17, Third Party Copy, will force all
> parties to depend on DNS in order for the protocol to work. While system and
> network administrators should be free to make this choice (and invest the
> effort in making DNS suitably robust), this protocol design should NOT be
> based on the assumption that DNS is a robust highly available service. The
> protocol should be based on IP addresses.

It is true that the system recommends using DNS. However, the administrator
is free to choose names such as "123.45.67.89", and the initiators and
targets will interpret them as IPv4 (or IPv6) addresses as necessary. It was
felt that we should be completely independent of IP addresses because of
firewall and IP masquerading issues with setting up new TCP connections. IP
addresses /can/ be used, but only in dotted decimal notation. Note that no
addresses need be provided for simple systems, and all key:value pairs can
be safely ignored by the target.

> Conceptual
> ----------
> * The iSCSI protocol requires a strong authentication mechanism. In its
> current form, without an implementation and corresponding specification, it
> is impossible to write an interoperable authentication implementation from
> the document as it stands, hence at least one strong authentication
> mechanism must be mapped onto the protocol, possibly in a separate document
> or documents.

Correct. We decided to make a flexible framework for authentication, rather
than specify a particular method. Specific authentication schemes could be
described in other documents.
We briefly considered (and are not outright rejecting) other schemes, most
notably the one used in SST (SCSI over ST), in which the connection can go
through 3 stages: Idle, Authenticating, Active. One bit in the login
indicates whether authentication is required and puts the state machine in
either the Authenticating stage or the Active stage. The standard does not
address how you go from Authenticating to Active. This design enables
non-authenticating machines to interoperate and leaves the whole
authentication process open to other standards. We felt that we have to
specify at least a minimal authentication scheme, to avoid "good faith"
mistakes, but we are open to discussing this in the working group at some
length.

> * The parameter negotiation, described in sections 3.9-12, is very general.
> The free-form text/value format will cost code to parse and may not be
> justified.

We designed the system so that any non-response to a TEXT command is
considered to mean "not supported". On targets or initiators where
text:value is too complex, a set of defaults should be chosen and no TEXT
commands supported. For targets, MODE SELECT can set SCSI-like things. The
TEXT command covers network-like things.

> * The action of killing all outstanding IOs on a login or operation timeout
> seems too severe for this process and provides an opening for a denial of
> service attack. Also there is no other rationale in the document as to why
> this semantic is useful.

I assume you are referring to what is written in the section on Error
Handling (section 4.0). Denial of service is a problem inherent in all IP
based protocols, and we cannot completely solve it. The initiator can wait a
long time before it determines that it has timed out. TCP ensures ordered
delivery as long as there is a connection. What alternative is there, other
than to clean up completely, once it has been decided that we have a
connection problem?

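To make the TEXT rule above concrete (an unanswered key:value pair counts as
not supported), here is a minimal sketch of how an initiator might resolve
one exchange. It is an illustration only and not text from the draft: the
function name, the "ExampleKey" key and the fallback to a table of defaults
are assumptions of the sketch; "AllowNoRTT" is the key from section 10.5.

```python
from typing import Dict, Optional, Set, Tuple


def interpret_text_reply(
    offered: Dict[str, str],
    reply: Optional[Dict[str, str]],
    defaults: Dict[str, str],
) -> Tuple[Dict[str, str], Set[str]]:
    """Resolve a key:value exchange from the sender's point of view.

    offered  -- key:value pairs that were sent
    reply    -- pairs the peer answered with, or None if it sent no reply
    defaults -- values used for unanswered keys (assumption of this sketch)
    Returns (effective settings, keys treated as not supported).
    """
    answered = reply or {}
    effective: Dict[str, str] = {}
    unsupported: Set[str] = set()
    for key in offered:
        if key in answered:
            # The peer answered: use the value it returned.
            effective[key] = answered[key]
        else:
            # No response for this key: treat it as not supported.
            unsupported.add(key)
            if key in defaults:
                effective[key] = defaults[key]
    return effective, unsupported


if __name__ == "__main__":
    settings, ignored = interpret_text_reply(
        offered={"AllowNoRTT": "yes", "ExampleKey": "1"},
        reply={"AllowNoRTT": "yes"},
        defaults={"AllowNoRTT": "no", "ExampleKey": "0"},
    )
    print(settings, ignored)
```

A peer that supports no TEXT commands at all is simply the case where the
reply is None: every offered key ends up in the "not supported" set.
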
> * A general mapping of error recovery for iSCSI is needed, i.e. what parts
> need definition versus what will use TCP error recovery mechanisms.

Did you have a particular situation in mind that iSCSI does not cover?

> * In section 3.17, Third party copy needs a much better explanation about
> authentication, login and how the entire process works.

Again, this is a framework. When devices start offering third party commands
that go beyond the provisions of iSCSI, we will extend it. We know about,
and we think we have covered, the extended copy commands considered by the
SCSI working group.

> Specifics
> ---------
> * It should be stated specifically in sections 2.4 and 3.8 that iSCSI data
> segments cannot overlap.

We agree that iSCSI should state that data segments should not overlap (and
will do this in the next version). However, we would be reluctant to require
that receiver implementations check for this type of error and report it in
the status. Is this acceptable?

> * The expected data length and flags, i.e. command direction, should be
> described in the SCB itself and not as separate fields in the SCSI command,
> see section 3.3.

As stated by SAM, the SCB contains only the number of data blocks, not the
transfer length. SAM also mandates that the "execution request" include the
data length, and CAM (as well as other standard software interfaces)
requires a residual count report with reference to the length. Including the
length makes all implementations "more compliant". For all hardware bridge
providers it also makes more sense to have the length and direction in a
"common" header than to scan SCBs.

> * Using the task tag and TCP connection 4-tuple (source and destination IP
> addresses and ports) we should have a fully qualified identifier and should
> not need LUN number in the response and task management response, see
> section 3.3 and 3.6.

You are right - it has been on and off so many times!
It ended up being there to make all "target-to-initiator" controls
identical. The last argument for keeping it in was a "proxy LUN", i.e. the
work was done by a "third party". If the returned LUN disagrees with the
transmitted LUN then it may mean that a proxy satisfied the request.
However, we have not specified what action should be taken and I cannot at
present think of anything useful to do with any proxy-LUN information. We
(the working group, including you hopefully, in its infinite wisdom!) might
decide to remove it.

> * The LUN number should be embedded in the data for the AEN, see section
> 3.4.

We do not specify what goes in the data that is sent in an Asynchronous
Event Notification. I think SAM-2 requires the LUN to be specified (as a
parameter). We want to be independent of whatever data is packaged, and we
therefore have to specify the LUN in the header.

> * In section 5.1 a recommendation is made to use 8k as the upper limit for
> small TCP segments. Depending on the MTU size this recommendation may cause
> fragmentation. More detail and analysis are needed to justify this
> recommendation.

8k is an upper limit. If the MTU size is smaller, then a smaller data size
should be used, as implied by the note to the implementer. 8k is also an
upper limit for good CRC algorithms (perhaps 8k is too big for this also).
We welcome a more detailed analysis to provide a better recommendation.

> * A standard CRC should be required, see section 6.1.

I agree that a good CRC is a good thing to have. I think that a TCP-CRC
should be mandated for the control channel. This should be set when opening
the TCP connection for the control channel. There are cases where CRC is not
desirable for the data connection, as when transferring transient voice or
video. Hence there ought to be some kind of negotiation as to whether CRC
will be used for the data channel (like a parameter for open). Let's talk
some more about it.

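On the 8k limit discussed above, the rule "use a smaller data size when the
MTU is smaller" can be sketched as follows. This is a rough illustration
under stated assumptions, not the draft's procedure: it assumes IPv4 and TCP
headers of 20 bytes each (the minimum, with no options), it ignores the
iSCSI header itself, and the function and constant names are ours.

```python
IPV4_HEADER_BYTES = 20   # minimum IPv4 header, no options (assumption)
TCP_HEADER_BYTES = 20    # minimum TCP header, no options (assumption)
UPPER_LIMIT_BYTES = 8 * 1024  # the 8k upper limit recommended in section 5.1


def max_data_size(mtu: int, upper_limit: int = UPPER_LIMIT_BYTES) -> int:
    """Largest data size that fits in one unfragmented TCP segment.

    The result is capped at the upper limit and shrinks when the MTU is
    too small to carry that much payload in a single IP packet.
    """
    payload = mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES
    if payload <= 0:
        raise ValueError("MTU too small to carry any TCP payload")
    return min(upper_limit, payload)


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(mtu, max_data_size(mtu))
```

With a 1500-byte Ethernet MTU the MTU is the binding constraint; with a
9000-byte MTU the 8k limit is.
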
> * The target should not gets its name from the initiator, see section 10.1.

The target can ignore any key:value pairs sent by the initiator, so it need
not receive its name from the initiator. This feature is useful when the
target is actually a front end for many machines and/or disks, in which case
the initiator can specify which target it really wants to interact with.

> * Section 10.3 needs to provide details on how to prevent reply/reuse. Also
> this text seems to allow passwords in the clear which is not acceptable.

The example given is conceptual. You can use encryption if the initiator and
target can agree on it, or if it is automatically provided by the TCP layer.
But we are ready to work some more on it.

> * In section 10.5 it states "Once AllowNoRTT has been set to 'yes', it
> cannot be set back to no". It should clarify this is for the open
> connection and closing this connection and opening a new connection will
> clear this condition.

This was the intention. We will clarify.

> Questions
> --------
> * What value does the ability to do an iSCSI ping add to the existing
> ability to do an ICMP ECHO? If little or none, this should be omitted, see
> section 3.15.

This is very valuable. First, ICMP may be blocked by a firewall. Second, it
is very useful to test certain pathological data sets over particular
networks. Third, when a TCP link is not being used, no data is sent, which
makes it almost impossible to detect whether the connection has been broken.
Having a ping command allows the TCP connection to be tested periodically.
And it tests more than just the TCP/IP stack - a valuable add-on in many
settings.

> TCP-RDMA
> --------
> Although the premise of TCP acceleration is quite useful the concept of RDMA
> does not apply for our application of internet SCSI. We will handle the
> moving of data as implementation specific and not as generic design such as
> RDMA.

As they say - we all live in a free world...
I would agree that you have a strong case for a controller, but I am not
that confident about a general purpose host adapter, such as a NIC (not
SCSI specific).

> --- end forwarded message by harwood, jack ---

Regards,
Julo

Julian Satran (on behalf of all my colleagues), IBM Research at Haifa