RE: iSCSI: CmdSN and Retry

Julo wrote:
> Except for sending the status - an executing command holds up the LU queue
> and makes the "local" recovery simpler than clearing the LU queue and
> resending the commands.

Santosh wrote:
> This is correct, in the case where the target does NOT implement
> data/status recovery. i.e. Assume ordering is required, 2 commands, say,
> 1 & 2, executed in order at the target. Now, if 1 encountered a digest error
> or format error at the initiator, and was re-sent with the "retry" bit
> AND the target were to NOT implement data/status recovery, it would result
> in the target executing 1, 2, 1. This may be a problem and cannot be
> addressed, unless iSCSI mandates data/status recovery.

I certainly understand the need for data/status recovery and the argument that "local" recovery is simpler. However, this comes at a heavy performance cost when a pipelined design demands 100,000 IOs per second on a network with long delay. For services and responses that happen a few times per second, it is OK to hold on to the resources until we are certain the ACK has been returned. However, in the example above, if a target cannot start command 2 after completing command 1 until the status for 1 is ACK'ed, the wait can be 100 milliseconds on a network with long delay. That wait makes it impossible to keep a large number of IOs in the pipeline. By mandating data/status recovery in iSCSI, we change pipelined command execution into interlocked handshakes.

As I said in my previous email, an initiator will never send a command that depends on the success of a previous command. This fact makes pipelined execution in a target possible.

On a separate note, I really respect Santosh's fine-tooth analysis of the iSCSI draft. But his arguments badly ignore the fact that SCSI has been functional for the last 20 years.
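To put rough numbers on the latency argument above, here is a minimal back-of-the-envelope sketch. It assumes the 100 ms wait and the 100,000 IOs per second target mentioned above, ignores command service time entirely, and uses hypothetical function names; it illustrates the arithmetic and does not model any real target.

```python
def interlocked_iops(ack_wait_ms: int) -> float:
    """Upper bound on commands/sec when each command must wait for the
    previous command's status to be ACKed (one command per wait)."""
    return 1000 / ack_wait_ms


def pipeline_depth_needed(target_iops: int, delay_ms: int) -> int:
    """Little's law: commands in flight = throughput * latency."""
    return target_iops * delay_ms // 1000


if __name__ == "__main__":
    # Assumed figures taken from the discussion: 100 ms wait, 100,000 IO/s.
    print(interlocked_iops(100))                 # commands/sec under interlock
    print(pipeline_depth_needed(100_000, 100))   # commands that must be in flight
```

Under these assumptions an interlocked target is limited to about 10 commands per second on that path, while sustaining the target rate needs on the order of 10,000 commands in flight.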
CmdSN, DataSN, and StatSN allow iSCSI to detect missing PDUs and to quickly ask for a retransmit. They should not be used to enforce sequentiality and slow things down. SCSI already has ordered-execution semantics, which need the help of CmdSN when multiple TCP connections are used. However, using StatSN to mandate data/status retry comes at a great performance price.

Both overlapped and out-of-order data transfers are allowed in SCSI (check out the Modify Data Pointer extended message). SCSI works fine without mandating non-overlapping transfers or data/status recovery.

Retry can be done in a simple and clean manner without introducing complicated semantics for CmdSN, DataSN, and StatSN. Note that if we must retry more than once in a million IOs, something is wrong with the infrastructure. Therefore, let the pipeline flow quickly and don't optimize the retry.

As long as we separate the TCP, iSCSI, and SCSI ULP layers cleanly, for which this WG has done a good job, SCSI will continue to work.

Without wasting more bandwidth on this subject, I will be willing to discuss SCSI retry implementations with anyone offline.
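The detection role described above for the sequence numbers, noticing a missing PDU so that a retransmit can be requested, can be sketched as follows. This is a minimal illustration, not the algorithm from the iSCSI draft: the function name is hypothetical, the numbers are plain integers, and 32-bit wraparound is ignored.

```python
from typing import List


def missing_sequence_numbers(expected_next: int, received: List[int]) -> List[int]:
    """Return the sequence numbers between expected_next and the highest
    number received (inclusive) that have not arrived yet."""
    if not received:
        return []
    seen = set(received)
    return [sn for sn in range(expected_next, max(received) + 1) if sn not in seen]


if __name__ == "__main__":
    # PDUs 7 and 10 were lost; only those need to be requested again.
    print(missing_sequence_numbers(5, [5, 6, 8, 9, 11]))  # [7, 10]
```

Note that the receiver keeps accepting later PDUs while the gaps are outstanding, so detection does not stall the pipeline.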
Last updated: Tue Sep 04 01:05:36 2001