
    RE: Avoiding deadlock in iSCSI



    
    
    That is a big issue, if what I hear is true. It isn't even related to
    latency, only to having a queue. If you have a long-running command and a
    queue of 1 (awaiting execution), you can hit this error. I was
    (mistakenly) confident that it is handled by ACA. Is it safe to assume
    that T10 will handle it?
    
    Julo
    
    
    
    
    Jim McGrath <Jim.McGrath@quantum.com> on 12/09/2000 23:09:00
    
    Please respond to Jim McGrath <Jim.McGrath@quantum.com>
    
    To:   "'csapuntz@cisco.com'" <csapuntz@cisco.com>, Jim McGrath
          <Jim.McGrath@quantum.com>
    cc:   ips@ece.cmu.edu (bcc: Julian Satran/Haifa/IBM)
    Subject:  RE: Avoiding deadlock in iSCSI
    
    
    
    
    
    I agree with Stephen that this is a T10 issue.  BTW, it has been discussed
    before in T10.  The issue is that for commands ending with a CHECK
    CONDITION (error) status the ACA mechanism provides a very targeted means
    of solving this problem (use ACA for your ORDERED commands and you are
    fine).  The QErr bit in the Control Mode page provides a similar global
    capability for the entire LUN.
    
    The problem is that there are other status conditions that may want to be
    treated in a similar manner, but which (to my knowledge) are not.  The
    first one we looked at was RESERVATION CONFLICT.  The second is QUEUE
    FULL.  Both may create a desire to pause execution from the command queue.
    
    The current answer is simply not to get into this problem to begin with.
    That is, if you are sending an ORDERED command, make sure you do not send
    out the next command (ORDERED or UNORDERED) until you (the initiator) know
    that the first command has been received by the target.  You clearly know
    that a command has been received by an explicit acknowledgement (as in
    parallel SCSI) or by a subsequent action (a data transfer or status
    transfer) that only makes sense if the command was received.
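
    The hold-back rule above can be sketched in a few lines of Python.  This
    is only an illustration under stated assumptions: the Initiator class,
    its method names, and the integer command tags are hypothetical, and
    acknowledge() stands in for whatever tells the initiator the command
    arrived (an explicit acknowledgement, or a data or status transfer).

```python
from collections import deque


class Initiator:
    """Toy initiator that sends nothing past an unacknowledged ORDERED command."""

    def __init__(self):
        self.pending = deque()       # commands submitted but not yet sent
        self.sent = []               # tags in the order they went on the wire
        self.unacked_ordered = None  # tag of the ORDERED command we wait on

    def submit(self, tag, ordered=False):
        self.pending.append((tag, ordered))
        self._pump()

    def acknowledge(self, tag):
        # Hypothetical hook: the target is known to have received `tag`.
        if self.unacked_ordered == tag:
            self.unacked_ordered = None
            self._pump()

    def _pump(self):
        # Send queued commands until an ORDERED one is outstanding.
        while self.pending and self.unacked_ordered is None:
            tag, ordered = self.pending.popleft()
            self.sent.append(tag)
            if ordered:
                self.unacked_ordered = tag


initiator = Initiator()
initiator.submit(1, ordered=True)
initiator.submit(2, ordered=True)
initiator.submit(3)
sent_before_ack = list(initiator.sent)   # only command 1 is on the wire
initiator.acknowledge(1)
sent_after_first_ack = list(initiator.sent)
initiator.acknowledge(2)
sent_after_second_ack = list(initiator.sent)
```

    The cost is visible in the sketch: each ORDERED command adds one wait
    for the target's acknowledgement before anything else can be sent.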
    
    You can argue that this degrades performance.  For most of the
    interconnects considered up to now, with relatively low latency, this has
    not been as much of a concern.  In addition, there has been a feeling that
    the cases where ORDERED commands are used are so few that you either
    accept the degradation (such as it is) when you use them, or you do a
    vendor unique approach around the problem.
    
    If this is considered to be a big issue, then I'd raise it to T10.  I
    suggest you propose a solution that mimics the existing ACA and QErr
    solutions, since that will be easiest and quickest to adopt.  One
    possibility is adding the status QUEUE FULL to the things that generate
    ACAs.  Another is to propose a new QACA bit that can be used like ACA but
    for Queue Full (so users can continue to use the old ACA bit for backward
    compatibility).  Since you either introduce a compatibility issue (the
    first approach) or need a bit (the second approach), the answer is not
    cost free, so T10 will want some justification for creating a
    compatibility issue or (more likely) for spending the bit.
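
    As a rough illustration of what a QACA-style behavior might look like,
    here is a toy Python model.  Everything in it is an assumption: no QACA
    bit exists, and the Target class, the "BLOCKED" status, and the clear()
    call are invented for the sketch.  It assumes that, as with ACA, the
    target refuses later commands after the triggering status until the
    initiator explicitly clears the condition.

```python
class Target:
    """Toy target with an optional, hypothetical QACA-style condition."""

    def __init__(self, queue_depth=1, qaca=False):
        self.queue_depth = queue_depth
        self.qaca = qaca        # hypothetical bit: block the queue on QUEUE FULL
        self.blocked = False
        self.queue = []         # commands currently held by the target
        self.executed = []      # tags in completion order

    def receive(self, tag):
        if self.blocked:
            return "BLOCKED"    # invented status, loosely modeled on ACA ACTIVE
        if len(self.queue) >= self.queue_depth:
            if self.qaca:
                self.blocked = True
            return "QUEUE FULL"
        self.queue.append(tag)
        return "QUEUED"

    def complete(self, tag):
        self.queue.remove(tag)
        self.executed.append(tag)

    def clear(self):
        # The initiator clears the condition once it has seen QUEUE FULL.
        self.blocked = False


target = Target(queue_depth=1, qaca=True)
statuses = [target.receive(1), target.receive(2)]  # 2 is refused, queue blocks
target.complete(1)
statuses.append(target.receive(3))                 # 3 cannot overtake 2
target.clear()
for tag in (2, 3):                                 # initiator resends in order
    statuses.append(target.receive(tag))
    target.complete(tag)
```

    In this model the three ORDERED commands from the scenario quoted below
    complete in order 1, 2, 3, because command 3 is refused until the
    initiator has dealt with the QUEUE FULL for command 2.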
    
    Jim
    
    
    -----Original Message-----
    From: csapuntz@cisco.com [mailto:csapuntz@cisco.com]
    Sent: Monday, September 11, 2000 10:33 PM
    To: Jim McGrath
    Cc: ips@ece.cmu.edu; csapuntz@cisco.com
    Subject: Re: Avoiding deadlock in iSCSI
    
    
    
    > Note that SCSI targets, when faced with getting a command queue full, do
    > not stop reading from the interconnect.  If more commands are received
    > then they respond with a QUEUE FULL status.  If data is received then
    > they receive that data without regard for the status of the command
    > queue (as long as it is data for an already queued command).  This
    > eliminates the potential for deadlock between command and data queues.
    
    Jim,
    
    I believe there is a problem with the current SCSI behavior. Consider the
    following scenario for a host with a pathological queue of 1:
    
    
         1) Initiator sends command 1 (ORDERED attribute)
         2) Initiator sends command 2 (ORDERED attribute)
         3) Initiator sends command 3 (ORDERED attribute)
         4) Target reads command 1
         5) Target reads command 2
         6) Target returns queue full for command 2
         7) Command 1 completes
         8) Target reads command 3
         9) Target executes command 3
    
    We have just violated the ordering constraints of the application by
    doing command 3 before command 2.
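
    The nine steps above can be replayed with a small Python model.  This is
    a sketch only: the simulate() function and the event tuples are made up
    for illustration, and the target is reduced to a queue of depth 1 that
    answers QUEUE FULL when it has no room.

```python
def simulate(events, queue_depth=1):
    """Replay target-side events; return (completion order, rejected tags)."""
    queued = []     # commands currently held by the target
    executed = []   # tags in the order the target completed them
    rejected = []   # tags answered with QUEUE FULL
    for kind, tag in events:
        if kind == "read":
            if len(queued) >= queue_depth:
                rejected.append(tag)   # no room: QUEUE FULL
            else:
                queued.append(tag)
        elif kind == "complete":
            queued.remove(tag)
            executed.append(tag)
    return executed, rejected


# Steps 4-9 of the scenario (steps 1-3 only put the commands on the wire).
events = [
    ("read", 1),
    ("read", 2),       # queue still holds command 1, so 2 gets QUEUE FULL
    ("complete", 1),
    ("read", 3),       # room again, so 3 is accepted
    ("complete", 3),
]
executed, rejected = simulate(events)
print(executed, rejected)
```

    The model completes commands 1 and 3 and rejects command 2, which is the
    reordering described above.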
    
    > Potentially you can have a situation where multiple commands already at
    > the target have only part of their data transmitted, with the remainder
    > still at the initiator(s), and then run out of buffer space for data.
    > If the target uses a credit model to pace the reception of data, it can
    > also make sure this never happens.  Unsolicited data, even for commands
    > already queued, can end up creating this deadlock - which is why
    > unsolicited data systems either have to have a tight limit on the
    > resources it can use (e.g. low login BB credit in Fibre Channel terms)
    > or some sort of clean (i.e. not IO terminating) rejection mechanism from
    > target to initiator (like in USB).
    
    Unsolicited data is NOT a problem with the current iSCSI spec. We allow
    the target to always drop data and request data transfers with an RTT.
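
    A minimal Python sketch of that drop-and-request idea follows.  The
    DataBuffer class, its byte-count bookkeeping, and its method names are
    hypothetical and are not taken from the iSCSI draft; the sketch assumes
    the target simply remembers what it dropped and asks for it again with
    an RTT once buffer space is free.

```python
class DataBuffer:
    """Toy target data buffer: drop unsolicited data that does not fit."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.dropped = []   # (tag, length) segments to re-request later

    def unsolicited(self, tag, length):
        if self.used + length > self.capacity:
            self.dropped.append((tag, length))   # drop, never block
            return "DROPPED"
        self.used += length
        return "BUFFERED"

    def release(self, length):
        # Data has been written out, so its buffer space is free again.
        self.used -= length

    def next_rtt(self):
        """Return the (tag, length) to request next, or None if no room."""
        if self.dropped and self.used + self.dropped[0][1] <= self.capacity:
            tag, length = self.dropped.pop(0)
            self.used += length   # reserve space before asking for the data
            return (tag, length)
        return None


buf = DataBuffer(capacity=100)
first = buf.unsolicited(tag=1, length=80)
second = buf.unsolicited(tag=2, length=50)   # does not fit, gets dropped
rtt_while_full = buf.next_rtt()
buf.release(80)
rtt_after_release = buf.next_rtt()
```

    Because the target only ever requests what it has already reserved room
    for, the data path in this model cannot fill up and stall.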
    
    -Costa
    
    
    
    



Last updated: Tue Sep 04 01:07:18 2001