RE: Command Queue Depth (was asymmetric/Symmetric)

Matt,

I agree that T10 has to take a role in these issues - I was just pointing out that they are not new issues. T10 has discussed them before in the context of parallel SCSI and Fibre Channel, and has never come up with a very good solution. So I would advise caution on expecting iSCSI (whether the work is done here or in T10) to come up with a good solution anytime soon - basically this whole area is a "research project".

On the issue of latency, the key here is that different physical transports and (more importantly) the configurations they enable have a dramatic impact on the absolute amount of buffer space, credits, etc. required for good performance. As an example, while you are right that the elasticity buffer in FC-AL is a major latency issue, it is not large in absolute terms: today I would expect less than 1 us for the sum of all the elasticity-buffer latencies in the type of system you mentioned, a 20-drive loop. By contrast, store-and-forward switches introduce at least tens of us of latency per switch in the path - and that is if everything is done at hardware speeds. While hardware-speed forwarding is not uncommon for layer-2 Ethernet switches, layer-3 and layer-4 switches are still usually much slower and/or more expensive, due to the (historical) complexity of automating those protocols and the resulting reliance on very fast (and expensive) CPUs.

Of greater import is that FC-AL is so limited in scale that it is physically impossible to run up very high latency. By contrast, TCP/IP running over some mixture of physical plants, with potentially global transmission paths, can run up latencies in the ms, if not seconds, range.
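The latency figures above can be sanity-checked with a bandwidth-delay product calculation. The sketch below is illustrative only - the function names are my own, and the 2 Kbyte frame size is taken from the FC credit discussion elsewhere in this thread:

```python
import math

def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight (and hence
    buffered) to keep a link of the given speed full across the round trip."""
    return bandwidth_bps / 8 * rtt_s

def initial_credits(bandwidth_bps: float, rtt_s: float,
                    frame_bytes: int = 2048) -> int:
    """Frame credits needed to keep sending until replenished credits arrive."""
    return math.ceil(bdp_bytes(bandwidth_bps, rtt_s) / frame_bytes)

# FC-AL-scale latency (tens of us): a couple of 2 KB credits cover it.
print(initial_credits(1e9, 20e-6))    # -> 2
# 1 ms latency on a Gbit wire: dozens of initial credits.
print(initial_credits(1e9, 1e-3))     # -> 62
# 100 ms (global internet) at Gbit speed: ~12.5 MB per connection, which
# multiplies into hundreds of MB across many connections/initiators.
print(bdp_bytes(1e9, 100e-3) / 1e6)   # -> 12.5 (MB)
```

The per-connection numbers line up with the credit counts discussed below: a handful of credits suffice at FC-AL latencies, while internet-scale latencies push buffering into the MB range per connection.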
While this is a great strength of the internet, it is a real problem in this context: 100 ms latencies imply hundreds of MB of buffering for Gbit speeds - or losing a lot of performance if the protocol requires a lot of turnaround delays for things like getting buffer credits. As usual, confining the application space can make your job a lot easier. But if iSCSI is targeting the general, global, high-speed TCP/IP environment, then you have a lot of issues. Personally, I'd rather restrict the application space some more to make the job easier.

Jim

-----Original Message-----
From: Matt Wakeley [mailto:matt_wakeley@agilent.com]
Sent: Thursday, September 07, 2000 10:25 PM
To: Jim McGrath; ips
Subject: Re: Command Queue Depth (was asymmetric/Symmetric)

Jim McGrath wrote:

> I agree that BB credit has nothing to do with commands per se, but
> illustrates the problem we have had with deciding on policies for the
> distributed allocation of device resources over multiple initiators. In my
> postscript I noted that the same problems have arisen on command queues (how
> many queue slots do you get), with similarly no satisfactory solution.

So it sounds to me like this distribution of (command) resources across initiators is a T10 SCSI issue, and should be solved there, not by each individual transport (yesterday FC - which didn't solve it - today iSCSI, tomorrow IB or whatever).

You've lost me on the following two paragraphs... Ethernet doesn't require credits to send frames - it's a ship-and-pray model. I don't know what the latency through Ethernet switches is, but I'd hope it isn't in the ms range. Finally, a "useful" FC-AL will have many devices on it (say 20 drives in a JBOD), and the latency of the elastic store really starts adding up. So I still contend that FC-AL is not a low-latency medium.

-Matt

> On FC-AL, the latency to get an initial credit is typically measured in us
> for a couple of reasons.
> First, much of that logic has been automated in the interface hardware
> (indeed, the major source of delay is typically the elasticity buffer,
> which is not store and forward and so has a very low latency compared to
> many switches or routers). Second, the distances are very small (e.g.
> hundreds of meters), so both transmission time and the opportunity for
> intervening devices to increase latency are lower than in the general
> internet world.
>
> So generally 2 credits (of 2 Kbyte frames) is enough to cover the latency
> and get you into streaming. If the latency of the system were measured in
> ms, then a transport on a Gbit wire would require more like 50 initial
> credits (or more) to cover the latency. Unless we are designing for a
> low-latency environment for the exchange of credits (like those where
> FC-AL is used), then we probably need to allocate so much buffer space
> that it becomes difficult to promise initial credits to a lot of potential
> initiators.
>
> Jim
>
> PS general Fibre Channel (e.g. with switches and the like) is a bit
> different.
>
> -----Original Message-----
> From: Matt Wakeley [mailto:matt_wakeley@agilent.com]
> Sent: Wednesday, September 06, 2000 8:23 PM
> To: ips
> Subject: Re: Command Queue Depth (was asymmetric/Symmetric)
>
> Jim,
>
> I agree that FC has tried (unsuccessfully) to address this command queue
> allocation problem.
>
> However, the "login BB credit" mechanism in FC does not address the
> command queue depth issue at all. BB credit is used to receive commands
> and/or data, and the target has no clue in advance what is coming. BB
> credit is just there to ensure that there is a lowest-layer buffer
> available to receive the FC frame (as opposed to dropping it on the floor
> if there is no "mac" buffer, like Ethernet does). It does not mean the
> command queue has any room for the frame.
>
> At one time, there was a big push to have "data" credits and "command"
> credits to take care of this problem, but it couldn't be made to work and
> be "backwards compatible".
>
> > The issue of buffer space allocation for multiple initiators has a long
> > and troubled history in SCSI. We have never been able to come up with a
> > good answer.
> >
> > Fibre Channel tried to fix this with the notion of "login BB credit" -
> > when you login you get a minimum number of credits you are always
> > guaranteed when you start data transfers. The problem with this is that
> > storage devices had no realistic ability to discriminate between
> > initiators or to change the login BB credit. In addition, the
> > expectation is that all possible initiators would get these credits on
> > login. So storage device vendors have played it safe and kept this
> > number low (at 0 until recently, now around 2). For iSCSI the number of
> > initial credits you need to "prime the pump" until normal data flow is
> > established is probably large (given the latencies are higher than in
> > Fibre Channel, especially FC-AL), and the
>
> How is the latency low on FC-AL, given that you need to arbitrate and win
> the loop, then receive BB credit, before you can send anything?
>
> -Matt
>
> > number of potential initiators larger than in Fibre Channel, making
> > this a whole lot worse for the storage device.
> >
> > As soon as we allow the devices to start adjusting these credits, then
> > you have the protocol problem of making sure people know when their
> > credits are adjusted, and the policy problem of how, who, and when to
> > adjust the credits. Changing everyone's credit when you add a new
> > initiator can get into a notification nightmare, although it is "fair."
> > Any policy brings up all sorts of nasty issues regarding fairness vs
> > efficient use of the transmission media.
> >
> > Jim
> >
> > Note: the same problem has plagued other attempts to allocate device
> > resources between multiple initiators, like command queue space. In
> > general, policies with respect to multiple initiators are not really
> > standard in the SCSI world.
> >
> > -----Original Message-----
> > From: Matt Wakeley [mailto:matt_wakeley@agilent.com]
> > Sent: Wednesday, September 06, 2000 2:56 PM
> > To: ips
> > Subject: Re: Command Queue Depth (was asymmetric/Symmetric)
> >
> > Joshua Tseng wrote:
> >
> > > James,
> > >
> > > I agree with others that there may be an issue with the command
> > > windowing mechanism in the existing iSCSI spec. It is like "TCP in
> > > reverse", in that the target determines the size of the window, and
> > > not the initiator as in TCP. Rather, I believe that everything that
> > > this windowing mechanism is attempting to achieve can be more easily
> > > obtained by having the target communicate its buffer size to the
> > > initiator at iSCSI login. It should be the role of the initiator to
> > > determine how many commands to put in flight simultaneously, given
> > > this input on available buffer size from the target.
> >
> > As more initiators connect to a target, it may need to scale back the
> > amount of buffering it has allocated to each previously logged-in
> > initiator (to prevent rejecting new logins).
> >
> > > As far as multiple initiators, could this not be resolved by the
> > > target refusing additional logins beyond the number of initiators it
> > > can safely support? Not being a storage expert, this is my best
> > > guess/suggestion at how to do it.
> >
> > I believe John already answered this...
> >
> > -Matt
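The scaling-back behavior described in the thread - a target shrinking each initiator's buffer allocation as new initiators log in - can be sketched as follows. This is purely illustrative: the `Target` class and the equal-share policy are my own invention, not anything from the iSCSI or FC specifications. It mainly shows why each login forces a renegotiation with every existing initiator:

```python
class Target:
    """Toy model of a target splitting a fixed frame-buffer pool
    equally among logged-in initiators."""

    def __init__(self, total_frames: int):
        self.total_frames = total_frames
        self.initiators: list[str] = []

    def login(self, name: str) -> dict[str, int]:
        """Admit a new initiator and recompute every allocation.

        Every existing initiator's credit shrinks, so each one must be
        told its new limit - the "notification nightmare" the thread
        describes when credits are adjusted dynamically."""
        self.initiators.append(name)
        share = self.total_frames // len(self.initiators)
        return {i: share for i in self.initiators}

t = Target(total_frames=120)
print(t.login("init-a"))   # {'init-a': 120}
print(t.login("init-b"))   # {'init-a': 60, 'init-b': 60}
print(t.login("init-c"))   # {'init-a': 40, 'init-b': 40, 'init-c': 40}
```

Any real policy would also have to address the fairness-versus-efficiency issues raised above (e.g. weighting allocations instead of splitting equally), which is exactly where the thread notes no standard solution exists.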