RE: iSCSI: CmdSN and Retry

To: "'Ips@Ece. Cmu. Edu'" <ips@ece.cmu.edu>
Subject: RE: iSCSI: CmdSN and Retry
From: "Y P Cheng" <ycheng@advansys.com>
Date: Thu, 1 Feb 2001 12:07:49 -0800
In-Reply-To: <C12569E5.002988C4.00@d12mta02.de.ibm.com>
Sender: owner-ips@ece.cmu.edu

Julo wrote:
> Except for sending the status - an executing command holds up the LU queue
> and makes the "local" recovery simpler than clearing the LU queue and
> resending the commands.
Santosh wrote:
> This is correct, in the case where the target does NOT implement
> data/status recovery. i.e. Assume ordering is required, 2 commands , say,
> 1 & 2, executed in order at the target. Now, if 1 encountered a digest error
> or format error at the initiator, and was re-sent with the "retry" bit
> AND the target were to NOT implement data/status recovery, it would result
> in target executing 1, 2, 1. This may be a problem and cannot be addressed,
> unless iSCSI mandates data/status recovery.

I certainly understand the need for data/status recovery and the
argument that "local" recovery is simpler.  However, this comes at a heavy
performance cost when a pipelined design demands 100,000 IOs per second on
a network with long delay.  For services and responses that happen a few
times per second, it is OK to hold on to the resources until we are certain
the ACK has been returned.  However, in the example above, if after
completing command 1 a target can't start command 2 until the status for 1
is ACK'ed, the wait can be 100 milliseconds on a network with long delay.
That wait makes it impossible to keep a large number of IOs in the
pipeline.  By mandating data/status recovery in iSCSI, we change pipelined
command execution into interlocked handshakes.  As I said in the previous
email, an initiator will never send a command that depends on the success
of a previous command.  This fact makes pipelined execution in a target
possible.
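
To put rough numbers on the argument above, here is a back-of-the-envelope
sketch.  It assumes a simplified model that is not in the draft: in the
interlocked case every command on a queue waits one full ACK delay and
execution time is ignored, and the pipelined case is sized with Little's law.
The function names are made up for illustration.

```python
def interlocked_iops(ack_wait_s: float) -> float:
    """Commands per second on one queue when each command must wait for
    the previous command's status to be ACK'ed (execution time ignored)."""
    return 1.0 / ack_wait_s


def pipeline_depth(target_iops: float, delay_s: float) -> float:
    """Commands that must be in flight at once to sustain target_iops
    when each one takes delay_s to complete (Little's law)."""
    return target_iops * delay_s


ack_wait = 0.100        # 100 ms wait for the status ACK, as in the example
target_rate = 100_000   # IOs per second demanded of the pipelined design

print("interlocked IOs per second:", interlocked_iops(ack_wait))
print("IOs in flight needed:", pipeline_depth(target_rate, ack_wait))
```

Under these assumptions an interlocked queue manages about 10 IOs per
second, while 100,000 IOs per second over the same 100 ms delay needs
roughly 10,000 commands in the pipeline at once.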

On a separate note, I really respect Santosh's fine-tooth-comb analysis of
the iSCSI draft.  But his arguments badly ignore the fact that SCSI has been
functional for the last 20 years.  The CmdSN, DataSN, and StatSN allow iSCSI
to detect missing PDUs and to quickly ask for a retransmit.  They should not
be used to enforce sequentiality and slow things down.  SCSI already has the
semantics of ordered execution, which needs the help of CmdSN when multiple
TCP connections are used.  However, using StatSN to mandate data/status
retry carries a great performance price.  Both overlapped and out-of-order
data transfers are allowed in SCSI (check out the Modify Data Pointer
extended message).  SCSI works fine without mandating non-overlapping
transfers or data/status recovery.  Retry can be done in a simple and clean
manner without introducing complicated semantics for CmdSN, DataSN, and
StatSN.  Note that if we must retry more than once in a million IOs,
something is wrong with the infrastructure.  Therefore, let the pipeline
flow quickly and don't optimize the retry.
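
As an illustration of using a sequence number only to detect missing PDUs,
and not to stall the pipeline, here is a minimal sketch.  The GapDetector
class, its method names, and the 32-bit wrapping counter are assumptions made
for this example and are not taken from the draft.

```python
from typing import List


class GapDetector:
    """Tracks one sequence-number stream (e.g. StatSN on a connection)
    and reports holes.  Hypothetical helper, not from the iSCSI draft."""

    MOD = 2 ** 32  # assumed 32-bit wrapping counter

    def __init__(self, first_expected: int) -> None:
        self.expected = first_expected % self.MOD

    def receive(self, sn: int) -> List[int]:
        """Return the sequence numbers skipped before sn.

        An empty list means nothing new is missing.  The receiver keeps
        going either way; the caller decides whether to ask for a
        retransmit of the numbers returned here.
        """
        ahead = (sn - self.expected) % self.MOD
        if ahead >= self.MOD // 2:
            return []  # old or duplicate PDU, already accounted for
        missing = [(self.expected + i) % self.MOD for i in range(ahead)]
        self.expected = (sn + 1) % self.MOD
        return missing


detector = GapDetector(first_expected=5)
for sn in (5, 8, 6, 9):
    print(sn, detector.receive(sn))
```

In this sketch the arrival of 8 reports 6 and 7 as missing, and everything
else passes straight through, so the common case costs one subtraction and
one comparison per PDU.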

As long as we separate the TCP, iSCSI, and SCSI ULP layers cleanly -- for
which this WG has done a good job -- SCSI will continue to work.  Rather
than spend more bandwidth on this subject, I am willing to discuss the
SCSI retry implementations with anyone offline.

Follow-Ups:
- Re: iSCSI: CmdSN and Retry
  - From: Santosh Rao <santoshr@cup.hp.com>
- RE: iSCSI: CmdSN and Retry
  - From: "Douglas Otis" <dotis@sanlight.net>

References:
- Re: iSCSI: CmdSN and Retry
  - From: julian_satran@il.ibm.com
