
    RE: Command Queue Depth (was asymmetric/Symmetric)



    
    The path I'd recommend is to allow people to oversubscribe a target's
    resources, and then to do a graceful recovery when that gets you into
    trouble.
    
    Note that drive designers do this all of the time - we tend to optimize for
    common cases, and then worry about how to handle outliers using other
    mechanisms.  While a more complicated model, it gets you the best overall
    resource utilization.  We use read-on-arrival, on-the-fly ECC, retries, and
    automatic reallocation, all in attempts to handle the common path quickly
    and the rare path more slowly, where the differences are error rates.
    
    In this case I would allow initiators to send down data immediately - when
    that works (as when on-the-fly ECC works) you get a benefit.  If packets
    are dropped you can rely on existing mechanisms to recover, or you can put
    in a new, improved, and perhaps more friendly process.  In either case I
    think the result would probably be better than a tight credit-based model,
    where a lot of delays would be introduced and a lot of (historically
    unresolved) allocation policy issues arise.
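    A minimal sketch (Python, with invented names) of the oversubscribe-and-recover
    model described above: the initiator submits commands optimistically, and
    anything the target drops for lack of queue space is simply retried:

```python
import collections

class Target:
    """Toy target with a fixed command queue; excess commands are dropped."""
    def __init__(self, depth):
        self.depth = depth
        self.queue = collections.deque()

    def submit(self, cmd):
        # Fast path: accept while queue space is free.
        if len(self.queue) < self.depth:
            self.queue.append(cmd)
            return "ACCEPTED"
        # Slow path: oversubscribed -- drop and let the initiator recover.
        return "QUEUE_FULL"

    def complete_one(self):
        return self.queue.popleft() if self.queue else None

def send_with_retry(target, cmds):
    """Initiator side: send everything immediately, retry what was dropped."""
    pending, completed = list(cmds), []
    while pending:
        # Resubmit; keep only the commands the target rejected.
        pending = [c for c in pending if target.submit(c) == "QUEUE_FULL"]
        done = target.complete_one()
        while done is not None:
            completed.append(done)
            done = target.complete_one()
    return completed
```

    Whether this beats a tight credit scheme depends, as argued above, on how
    rare the slow (recovery) path actually is.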
    
    Jim
    
    
    
    -----Original Message-----
    From: Charles Monia [mailto:cmonia@NishanSystems.com]
    Sent: Thursday, September 07, 2000 1:52 PM
    To: Julian Satran (E-mail)
    Cc: Ips (E-mail)
    Subject: RE: Command Queue Depth (was asymmetric/Symmetric)
    
    
    Hi Julo:
    
    > -----Original Message-----
    > From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
    > Sent: Thursday, September 07, 2000 6:11 AM
    > To: ips@ece.cmu.edu
    > Subject: RE: Command Queue Depth (was asymmetric/Symmetric)
    > 
    > 
    >  
    > Dear colleagues,
    > 
    > Although the windowing mechanism in iSCSI-01 may seem to be there to
    > solve a queueing issue, it is mainly meant to limit the buffering
    > space for commands that await "de-skewing".
    > We assume that execution queue-lengths, policy, etc. are beyond the
    > scope of transport.
    > 
    > As for SCSI queue length, I assumed that the busy or queue full
    > status followed by an Asynch Event message indicating readiness is
    > the mechanism provided by SCSI to regulate the command flow.
    > 
    > It is hard to imagine, given the variable life-time of SCSI commands
    > and the opaque nature of the resources required to execute them, that
    > the transport has to help in this area.
    > 
    
    While this issue has been discussed at some length in the past, as Jim
    McGrath stated, I believe the debate ought to be reopened (although we may
    end up reaching the same conclusion as before).
    
    As Ralph Weber pointed out, the SCSI model is to discard the command, return
    status, retrieve the next command in the transport pipeline and continue
    processing.  The initiator is not notified when processing resumes. If there
    are many commands in flight, as there could be in an IP environment, and
    target resources free up in the meantime, the result is commands processed
    out of order.
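    As a toy illustration (Python, names invented) of the reordering just
    described: commands past the target's queue depth get QUEUE FULL status,
    and by the time the initiator retries them, commands issued later may
    already have been accepted:

```python
def simulate(queue_depth, first_burst, later_cmds):
    """Sketch of the SCSI model described above: commands beyond the
    queue depth are discarded with QUEUE FULL status; the initiator
    retries them after newer commands have already been accepted, so
    completion order no longer matches issue order."""
    accepted = first_burst[:queue_depth]   # fit in the command queue
    rejected = first_burst[queue_depth:]   # returned QUEUE FULL status
    completed = list(accepted)             # first batch drains
    # Target resources free up: commands issued later arrive and complete
    # before the initiator's retries of the rejected commands.
    completed += later_cmds + rejected
    return completed
```

    Issued in order 1..5 against a depth-2 queue, the commands complete as
    1, 2, 5, 3, 4.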
    
    Historically, such a lapse in command ordering was not seen as an issue for
    the following reasons:
    
    a) Strict ordering was not required by the most commonly deployed device
    types (disks and tapes). Due to the nature of disk traffic, a simple retry
    mechanism was deemed sufficient to recover from these errors. Since legacy
    streaming devices, such as tapes, did not support command queuing, command
    ordering considerations were not a factor there either.
    
    b) Transport delays over storage interconnects were small, so not many
    commands were apt to be in flight; i.e., the window for such errors was
    very small.
    
    c) The resource guarantees needed for a loss-avoidance mechanism in the
    target adversely affected device cost, especially at the high-volume,
    low-end of the market.
    
    Given the above considerations, there was little support within the storage
    community for measures addressing this issue.
    
    If we now think the iSCSI environment changes the rules, I believe the
    interconnect protocol can provide useful assists, such as:
    
    a) On a command overflow condition, have the iSCSI target flush the
    command pipeline by returning status and discarding all subsequently
    received commands until a host acknowledgement is received.
    
    b) Implement some sort of credit-based mechanism for overflow-avoidance.
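    A minimal sketch of what a credit-based mechanism like option b) might
    look like - all names and numbers here are invented, not from any spec:
    the target grants command credits at login, the initiator spends one
    credit per command, and each returned status restores one, so the
    target's queue can never overflow:

```python
class CreditTarget:
    """Toy target that grants a fixed pool of command credits at login."""
    def __init__(self, total_credits):
        self.credits = total_credits
        self.queue = []

    def login(self):
        # Hand the whole credit pool to the initiator at login time.
        granted, self.credits = self.credits, 0
        return granted

class Initiator:
    def __init__(self, target):
        self.target = target
        self.credits = target.login()

    def send(self, cmd):
        if self.credits == 0:
            return False          # must wait for a completion first
        self.credits -= 1
        self.target.queue.append(cmd)
        return True

    def on_completion(self):
        # Each status returned by the target restores one credit.
        self.target.queue.pop(0)
        self.credits += 1
```

    The overflow-avoidance comes at the cost Jim notes elsewhere in this
    thread: someone has to decide how to size and adjust the credit pool.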
    
    
    Comments?
    
    
    <Stuff deleted>
    
    > Jim McGrath <Jim.McGrath@quantum.com> on 07/09/2000 06:06:40
    > 
    > Please respond to Jim McGrath <Jim.McGrath@quantum.com>
    > 
    > To:   "'Matt Wakeley'" <matt_wakeley@agilent.com>, ips <ips@ece.cmu.edu>
    > cc:    (bcc: Julian Satran/Haifa/IBM)
    > Subject:  RE: Command Queue Depth   (was asymmetric/Symmetric)
    > 
    > 
    > 
    > 
    > 
    > The issue of buffer space allocation for multiple initiators has a
    > long and troubled history in SCSI.  We have never been able to come
    > up with a good answer.
    > 
    > Fibre Channel tried to fix this with the notion of "login BB credit"
    > - when you login you get a minimum number of credits you are always
    > guaranteed when you start data transfers.  The problem with this is
    > that storage devices had no realistic ability to discriminate between
    > initiators or to change the login BB credit.  In addition, the
    > expectation is that all possible initiators would get these credits
    > on login.  So storage device vendors have played it safe and kept
    > this number low (at 0 until recently, now around 2).  For iSCSI the
    > number of initial credits you need to "prime the pump" until normal
    > data flow is established is probably large (given the latencies are
    > higher than in Fibre Channel, especially FC-AL), and the number of
    > potential initiators larger than in Fibre Channel, making this a
    > whole lot worse for the storage device.
    > 
    > As soon as we allow the devices to start adjusting these credits,
    > then you have the protocol problem of making sure people know when
    > their credits are adjusted, and the policy problem of how, who, and
    > when to adjust the credits.
    > Changing everyone's credit when you add a new initiator can get into
    > a notification nightmare, although it is "fair."  Any policy brings
    > up all sorts of nasty issues regarding fairness vs. efficient use of
    > the transmission media.
    > 
    > Jim
    > 
    > Note: the same problem has plagued other attempts to allocate device
    > resources between multiple initiators, like command queue space.  In
    > general, policies with respect to multiple initiators are not really
    > standard in the SCSI world.
    > 
    > 
    > -----Original Message-----
    > From: Matt Wakeley [mailto:matt_wakeley@agilent.com]
    > Sent: Wednesday, September 06, 2000 2:56 PM
    > To: ips
    > Subject: Re: Command Queue Depth (was asymmetric/Symmetric)
    > 
    > 
    > Joshua Tseng wrote:
    > 
    > > James,
    > >
    > > I agree with others that there may be an issue with the command
    > > windowing mechanism in the existing iSCSI spec.  It is like "TCP in
    > > reverse", in that the target determines the size of the window, and
    > > not the initiator as in TCP.  Rather, I believe that everything that
    > > this windowing mechanism is attempting to achieve can be more easily
    > > obtained by having the target communicate its buffer size to the
    > > initiator at iSCSI login.  It should be the role of the initiator to
    > > determine how many commands to put in flight simultaneously, given
    > > this input on available buffer size from the target.
    > 
    > As more initiators connect to a target, it may need to scale back the
    > amount of this buffering it has allocated to each previously logged
    > in initiator (to prevent rejecting new logins).
    > 
    > >
    > >
    > > As far as multiple initiators, could this not be resolved by the
    > > target refusing additional logins beyond the number of initiators
    > > it can safely support?  Not being a storage expert, this is my best
    > > guess/suggestion at how to do it.
    > 
    > I believe John already answered this...
    > 
    > -Matt
    > 
    > 
    >
    
    Charles Monia
    Senior Technology Consultant
    Nishan Systems Corporation
    email: cmonia@nishansystems.com
    voice: (408) 519-3986
    fax:   (408) 435-8385
     
    

