RE: ISCSI: Urgent Flag requirement violates TCP.

Dave Black wrote:

> It is not clear to me that the Urgent feature is "required for
> interoperation or to limit behavior which has potential for
> causing harm". I'm prepared to be convinced otherwise, and would
> like to hear from implementers other than Matt on this subject,
> and specifically comments on his statement that:
> "... high speed implementations will require framing in order
> to prevent a massive amount of buffer resources to 'buffer up' TCP
> segments that arrive after a dropped TCP segment."

My apologies for taking so long to respond to this request. I support Matt wholeheartedly. A TCP-Offload-Engine (TOE), a hardware-assisted TCP implementation with a zero-copy function, is essential for Gigabit-plus Ethernet, Fibre Channel, and InfiniBand adapters supporting iSCSI over TCP. Using the urgent bit to identify the beginning of an iSCSI message enables the TOE adapter to parse an incoming TCP/IP segment quickly and to deal with out-of-order and duplicated frames efficiently.

Most arguments against Matt's position were based on existing software TCP implementations. While supporting TCP 100%, the TOE adapter does require some changes to the host's TCP implementation at installation. The changes are necessary to enable the zero-copy function. However, a TOE adapter with its hardware and software will interoperate with any existing TCP implementation on any client or server by following the TCP spec.

A TOE is a multi-function adapter that supports both TCP/IP and iSCSI. The NFS implementation with UDP or TCP over IP can be supported by a scatter/gather DMA list which splits the TCP/IP header from the data payload such that the latter is copied directly from and to the NFS cache buffers. This is the essence of the zero-copy function.
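
As a rough software analogy for the header/payload split described above, the following Python sketch separates the headers of an IPv4/TCP packet from its payload without copying the payload. It only illustrates the idea; a real TOE would do this with DMA descriptors in hardware. The function name `split_headers` and the sample packet are made up for illustration.

```python
def split_headers(packet: bytes):
    """Return (headers, payload) as zero-copy views into an IPv4/TCP packet."""
    view = memoryview(packet)
    # IPv4 header length (IHL): low nibble of byte 0, counted in 32-bit words.
    ip_len = (view[0] & 0x0F) * 4
    # TCP data offset: high nibble of byte 12 of the TCP header, in 32-bit words.
    tcp_len = (view[ip_len + 12] >> 4) * 4
    hdr_len = ip_len + tcp_len
    # Slicing a memoryview creates new views, not copies of the bytes.
    return view[:hdr_len], view[hdr_len:]


# Hypothetical sample: minimal 20-byte IPv4 and TCP headers, mostly zero-filled.
ip_header = bytes([0x45]) + bytes(19)
tcp_header = bytes(12) + bytes([0x50]) + bytes(7)
packet = ip_header + tcp_header + b"NFS data"

headers, payload = split_headers(packet)
print(len(headers), bytes(payload))
```

Both returned views refer to the original packet buffer, so the payload could be handed to a cache buffer without an intermediate copy.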

To deal with out-of-order and duplicated frames, the TOE adapter works with one IP packet at a time, with a score card that tracks all incoming segments. The maximum IP packet size is 65K. Some implementations prefer "jumbo" packets or frames. Inside each IP packet, the TOE adapter finds the UDP/TCP headers.

For iSCSI support, the TOE adapter receives its SCSI requests directly from the iSCSI driver instead of from a TCP/IP driver. To avoid duplicating the TCP run-time parameters, the "OPTIONAL" second connection will be used for SCSI commands while another TCP connection will be used for login, logout, and other task administrative matters. Needless to say, there are many concurrent sessions to many different targets or initiators. Hundreds or thousands of concurrent SCSI commands to different TCP endpoints are implemented by an exchange table, as I have discussed before this posting. The adapter must move the incoming or outgoing TCP segments at the speed of the media and generate ACKs (or SACKs) quickly. As a target, the TOE adapter passes incoming SCSI commands directly back to the waiting application software, such as RAID, tape, or JBOD storage devices.

For outgoing frames or packets, the TOE adapter creates TCP segments with IP headers. It may bundle several iSCSI PDUs into one TCP/IP segment destined for the same target. For incoming frames, the TOE adapter must know the iSCSI message boundary. This is why the urgent bit is extremely useful. Without it, the TOE adapter must buffer the whole IP packet before it can process an iSCSI header. While a TOE adapter can deal with a 65K IP packet with ease, the "jumbo" frame places a large SRAM demand on the adapter. Several incoming packets from different sources only add to the SRAM requirements. For iSCSI, jumbo frames are very useful for clients or servers thousands of miles away. The TOE adapter must deal with hundreds of jumbo frames in flight. We are dealing with many gigantic TCP windows from many connections.
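
To make the buffering argument concrete, here is a minimal Python sketch. It assumes a toy PDU format (a 4-byte length header followed by the payload) and a per-segment `boundary` offset standing in for the information the urgent pointer would carry. Neither is the iSCSI wire format nor the exact TCP urgent-pointer semantics, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

HEADER_LEN = 4  # assumption: toy header is a 4-byte big-endian payload length


@dataclass
class Segment:
    seq: int                  # sequence number of the first byte in the segment
    data: bytes
    boundary: Optional[int]   # offset in data where a PDU header starts, if marked


def make_pdu(payload: bytes) -> bytes:
    """Build a toy PDU: length header followed by the payload."""
    return len(payload).to_bytes(HEADER_LEN, "big") + payload


def parse_from_boundary(seg: Segment) -> List[bytes]:
    """Return payloads of PDUs that lie entirely inside seg, starting at the marker."""
    pdus: List[bytes] = []
    if seg.boundary is None:
        return pdus  # no marker: nothing can be parsed until earlier bytes arrive
    pos = seg.boundary
    while pos + HEADER_LEN <= len(seg.data):
        length = int.from_bytes(seg.data[pos:pos + HEADER_LEN], "big")
        end = pos + HEADER_LEN + length
        if end > len(seg.data):
            break  # PDU continues in a later segment
        pdus.append(seg.data[pos + HEADER_LEN:end])
        pos = end
    return pdus


# Two PDUs sent back to back; the first segment (not modelled here) is lost.
pdu1, pdu2 = make_pdu(b"PDU one"), make_pdu(b"PDU two")
stream = pdu1 + pdu2
tail = 3                      # last 3 bytes of PDU 1 spill into segment 2
cut = len(pdu1) - tail
seg2_marked = Segment(seq=cut, data=stream[cut:], boundary=tail)
seg2_plain = Segment(seq=cut, data=stream[cut:], boundary=None)

print(parse_from_boundary(seg2_marked))
print(parse_from_boundary(seg2_plain))
```

With the marker, the receiver can parse the second PDU from the second segment even though the first segment is still missing. Without it, the receiver cannot tell where a header starts and has to hold the segment until the gap is filled.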

Many objections to the word "MUST" were based on out-of-order delivery of SCSI commands. For outgoing SCSI commands, the TOE adapter will deliver them in the same order received from its iSCSI driver. There is no problem here. For incoming SCSI commands, while honoring the TCP sequence numbers, a TOE adapter operates in the same manner as current SCSI, 1394, and Fibre Channel adapters, meaning there is no guarantee of in-order command execution.

To illustrate this, let's use an example of command A being followed closely by command B. For a SCSI adapter, if a target gets command A with a bus parity error, it would return check status for command A and proceed to accept B happily, even if there is a dependency between commands A and B. For an initiator device, the check status on A never blocks the delivery of B. This out-of-order delivery is OK because, if B depended on A, all file system software would hold command B until the completion of command A. For a 1394 adapter, commands A and B are stored in two ORBs. A target 1394 device will fetch the ORBs. Again, after encountering an error in fetching the ORB for A, a target device will proceed to fetch the ORB for B. For a Fibre Channel adapter, an initiator will send commands A and B in two separate FCP_CMD frames. If frame A arrives with a bad CRC, a target device simply throws the frame away and proceeds with execution of command B, if it arrived with a good CRC. Therefore, as Matt has stated, it is OK to deliver command B even if the command A segment is still missing. The urgent bit allows us to do that.

One objection to out-of-order delivery uses the aborting of a nonexistent command as an example. But abort is never deterministic. A command being aborted may either not exist or have already completed. This is known to all adapters today.

I do believe the iSCSI WG should facilitate the implementation of TOE adapters as well as accommodate the traditional TCP implementation. However, in either case, no TCP changes.
(Personally, I would like to see some changes. But, I learned my lessons earlier on this. :-))

Y.P. Cheng, CTO, ConnectCom Solutions Corp.