RE: Command Queue Depth (was asymmetric/Symmetric)

To: "'Charles Monia'" <cmonia@NishanSystems.com>, "Julian Satran (E-mail)" <julian_satran@il.ibm.com>
Subject: RE: Command Queue Depth (was asymmetric/Symmetric)
From: Jim McGrath <Jim.McGrath@quantum.com>
Date: Thu, 7 Sep 2000 20:43:52 -0700
Cc: "Ips (E-mail)" <ips@ece.cmu.edu>
Sender: owner-ips@ece.cmu.edu

The path I'd recommend is to allow people to oversubscribe a target's
resources, and then to do a graceful recovery when that gets you into
trouble.

Note that drive designers do this all of the time - we tend to optimize for
common cases, and then worry about how to handle outliers using other
mechanisms.  While this is a more complicated model, it gets you the best
overall resource utilization.  We use read on arrival, ECC on the fly,
retries, and auto reallocation, all in attempts to handle the common path
quickly and the rare path more slowly, where what separates the two paths is
the error rate.

In this case I would allow initiators to send down data immediately - when
that works (like when ECC on the fly works) you get a benefit.  If packets
are dropped you can rely on existing mechanisms to recover, or you can put
in a new, improved, and perhaps more friendly process.  In either case I
think the result would probably be better than a tight credit-based model,
where a lot of delays would be introduced and a lot of (historically
unresolved) allocation policy issues would arise.
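
To make the trade-off concrete, here is a rough sketch of the
oversubscribe-and-recover idea.  Everything in it is an assumption made for
illustration (the Target class, its fixed slot count, the per-round drain
rate, and the simple "resend whatever was dropped" recovery); it is not taken
from any iSCSI draft.

```python
from collections import deque


class Target:
    """Hypothetical target with a fixed number of buffer slots."""

    def __init__(self, slots):
        if slots < 1:
            raise ValueError("slots must be at least 1")
        self.slots = slots
        self.buffered = deque()

    def receive(self, item):
        # Common path: a slot is free, so the item is accepted.
        if len(self.buffered) < self.slots:
            self.buffered.append(item)
            return True
        # Rare path: the target is oversubscribed and drops the item.
        return False

    def drain(self, n):
        # Consume up to n buffered items, freeing their slots.
        done = []
        for _ in range(min(n, len(self.buffered))):
            done.append(self.buffered.popleft())
        return done


def send_optimistically(items, target, drain_per_round):
    """Send everything immediately; resend whatever was dropped."""
    if drain_per_round < 1:
        raise ValueError("drain_per_round must be at least 1")
    pending = list(items)
    completed, rounds = [], 0
    while pending:
        rounds += 1
        # No credits are requested first: just send and see what sticks.
        dropped = [x for x in pending if not target.receive(x)]
        completed.extend(target.drain(drain_per_round))
        pending = dropped  # recovery is a plain resend in the next round
    # Finish whatever is still buffered after the last send.
    while target.buffered:
        completed.extend(target.drain(drain_per_round))
    return completed, rounds
```

When the items fit in the target's buffers, the transfer finishes in a single
round with no setup delay; only the oversubscribed case pays for extra rounds.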

Jim

-----Original Message-----
From: Charles Monia [mailto:cmonia@NishanSystems.com]
Sent: Thursday, September 07, 2000 1:52 PM
To: Julian Satran (E-mail)
Cc: Ips (E-mail)
Subject: RE: Command Queue Depth (was asymmetric/Symmetric)

Hi Julo:

> -----Original Message-----
> From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
> Sent: Thursday, September 07, 2000 6:11 AM
> To: ips@ece.cmu.edu
> Subject: RE: Command Queue Depth (was asymmetric/Symmetric)
> 
> 
>  
> Dear colleagues,
> 
> Although the windowing mechanism in iSCSI-01 may seem to be there to solve
> a queueing issue, it is mainly meant to limit the buffering space for
> commands that await "de-skewing".  We assume that execution queue lengths,
> policy, etc. are beyond the scope of the transport.
> 
> As for SCSI queue length, I assumed that the busy or queue full status,
> followed by an Asynch Event message indicating readiness, is the mechanism
> provided by SCSI to regulate the command flow.
> 
> It is hard to imagine that, given the variable lifetime of SCSI commands
> and the opaque nature of the resources required to execute them, the
> transport has to help in this area.
> 

While this issue has been discussed at some length in the past, as Jim
McGrath stated, I believe the debate ought to be reopened (although we may
end up reaching the same conclusion as before).

As Ralph Weber pointed out, the SCSI model is to discard the command, return
status, retrieve the next command in the transport pipeline and continue
processing.  The initiator is not notified when processing resumes. If there
are many commands in flight, as there could be in an IP environment, and
target resources free up in the meantime, the result is commands processed
out of order.
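
The reordering can be shown with a small model.  This is only a sketch: the
function name, the slot counter, and the frees_after table (a stand-in for
queued commands completing at a given moment) are all hypothetical.

```python
def process_pipeline(commands, queue_slots, frees_after):
    """Model a target that discards commands when its queue is full.

    commands:    command ids in the order the initiator sent them
    queue_slots: free queue slots at the start
    frees_after: maps a command id to the number of slots that free up
                 right after that command is handled (an assumption used
                 to stand in for earlier commands completing)
    Returns (accepted, rejected) in the order the target saw them.
    """
    free = queue_slots
    accepted, rejected = [], []
    for cmd in commands:
        if free > 0:
            free -= 1
            accepted.append(cmd)
        else:
            # Discard, return status, and move on to the next command.
            rejected.append(cmd)
        # Resources may free up while later commands are still in flight.
        free += frees_after.get(cmd, 0)
    return accepted, rejected
```

With five commands in flight, two queue slots, and one slot freeing up right
after command 3 is rejected, the target accepts 1, 2 and 4 and rejects 3 and
5.  Command 4 therefore runs ahead of command 3, which the initiator still has
to resend.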

Historically, such a lapse in command ordering was not seen as an issue for
the following reasons:

a) Strict ordering was not required by the most commonly deployed device
types (disks and tapes). Due to the nature of disk traffic, a simple retry
mechanism was deemed sufficient to recover from these errors. Since legacy
streaming devices, such as tapes, did not support command queuing, command
ordering considerations were not a factor there either.

b) Transport delays over storage interconnects were small, so not many
commands were apt to be in flight.  That is, the window for such errors was
very small.

c)  The resource guarantees needed for a loss-avoidance mechanism in the
target adversely affected device cost, especially at the high-volume,
low end of the market.

Given the above considerations, there was little support within the storage
community for measures addressing this issue.

If we now believe that the iSCSI environment changes the rules, I believe
the interconnect protocol can provide useful assists, such as:

a)  On a command overflow condition, have the iSCSI target flush the command
pipeline by returning status and discarding all subsequently received
commands until a host acknowledgement is received.

b)  Implement some sort of credit-based mechanism for overflow-avoidance.
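
A rough sketch of both assists follows.  The class names, status strings, and
the acknowledge/grant calls are hypothetical and chosen only for illustration;
no existing iSCSI message is implied.

```python
class FlushingTarget:
    """Assist (a): after an overflow, reject everything until the host acks."""

    def __init__(self, queue_slots):
        self.free = queue_slots
        self.flushing = False
        self.accepted = []
        self.rejected = []

    def receive(self, cmd):
        if self.flushing or self.free == 0:
            # Overflow: enter (or stay in) flush mode so that no later
            # command can slip in ahead of the one just discarded.
            self.flushing = True
            self.rejected.append(cmd)
            return "QUEUE_FULL"
        self.free -= 1
        self.accepted.append(cmd)
        return "ACCEPTED"

    def complete_one(self):
        # A queued command finished and released its slot.
        self.free += 1

    def acknowledge(self):
        # Host acknowledgement: the initiator has seen the overflow and
        # will resend in order, so normal processing can resume.
        self.flushing = False


class CreditGate:
    """Assist (b): the initiator only sends while it holds credits."""

    def __init__(self, credits):
        self.credits = credits

    def try_send(self, target, cmd):
        if self.credits == 0:
            return "HELD"  # never sent, so it cannot overflow the target
        self.credits -= 1
        return target.receive(cmd)

    def grant(self, n=1):
        # The target hands back credits as its resources free up.
        self.credits += n
```

With (a), a command that arrives after the overflow is rejected even if a slot
has freed up in the meantime, so the initiator can resend everything in the
original order.  With (b), the overflow never happens, at the cost of the
credit bookkeeping and allocation policy.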

Comments?

<Stuff deleted>

> Jim McGrath <Jim.McGrath@quantum.com> on 07/09/2000 06:06:40
> 
> Please respond to Jim McGrath <Jim.McGrath@quantum.com>
> 
> To:   "'Matt Wakeley'" <matt_wakeley@agilent.com>, ips 
> <ips@ece.cmu.edu>
> cc:    (bcc: Julian Satran/Haifa/IBM)
> Subject:  RE: Command Queue Depth   (was asymmetric/Symmetric)
> 
> 
> 
> 
> 
> The issue of buffer space allocation for multiple initiators has a long and
> troubled history in SCSI.  We have never been able to come up with a good
> answer.
> 
> Fibre Channel tried to fix this with the notion of "login BB credit" - when
> you log in you get a minimum number of credits you are always guaranteed
> when you start data transfers.  The problem with this is that storage
> devices had no realistic ability to discriminate between initiators or to
> change the login BB credit.  In addition, the expectation is that all
> possible initiators would get these credits on login.  So storage device
> vendors have played it safe and kept this number low (at 0 until recently,
> now around 2).  For iSCSI the number of initial credits you need to "prime
> the pump" until normal data flow is established is probably large (given
> that the latencies are higher than in Fibre Channel, especially FC-AL), and
> the number of potential initiators is larger than in Fibre Channel, making
> this a whole lot worse for the storage device.
> 
> As soon as we allow the devices to start adjusting these credits, then you
> have the protocol problem of making sure people know when their credits are
> adjusted and the policy problem of how, who, and when to adjust the
> credits.  Changing everyone's credit when you add a new initiator can get
> into a notification nightmare, although it is "fair."  Any policy brings up
> all sorts of nasty issues regarding fairness vs efficient use of the
> transmission media.
> 
> Jim
> 
> Note: the same problem has plagued other attempts to allocate device
> resources between multiple initiators, like command queue space.  In
> general, policies with respect to multiple initiators are not really
> standard in the SCSI world.
> 
> 
> -----Original Message-----
> From: Matt Wakeley [mailto:matt_wakeley@agilent.com]
> Sent: Wednesday, September 06, 2000 2:56 PM
> To: ips
> Subject: Re: Command Queue Depth (was asymmetric/Symmetric)
> 
> 
> Joshua Tseng wrote:
> 
> > James,
> >
> > I agree with others that there may be an issue with the
> > command windowing mechanism in the existing iSCSI spec.  It is like
> > "TCP in reverse", in that the target determines the size of the window,
> > and not the initiator as in TCP.  Rather, I believe that everything that
> > this windowing mechanism is attempting to achieve can be more easily
> > obtained by having the target communicate its buffer size to the
> > initiator at iSCSI login.  It should be the role of the initiator to
> > determine how many commands to put in flight simultaneously, given this
> > input on available buffer size from the target.
> 
> As more initiators connect to a target, it may need to scale back the
> amount of this buffering it has allocated to each previously logged in
> initiator (to prevent rejecting new logins).
> 
> > As far as multiple initiators, could this not be resolved by the target
> > refusing additional logins beyond the number of initiators it can safely
> > support?  Not being a storage expert, this is my best guess/suggestion
> > at how to do it.
> 
> I believe John already answered this...
> 
> -Matt
> 
> 
>

Charles Monia
Senior Technology Consultant
Nishan Systems Corporation
email: cmonia@nishansystems.com
voice: (408) 519-3986
fax:   (408) 435-8385
