Re: iSCSI: Out of order commands

To: John Hufferd <hufferd@us.ibm.com>
Subject: Re: iSCSI: Out of order commands
From: "Mallikarjun C." <cbm@rose.hp.com>
Date: Fri, 09 Nov 2001 12:29:10 -0800
Cc: ips@ece.cmu.edu
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=us-ascii
Organization: Hewlett-Packard, Roseville
References: <OF1546DB59.B0BB3152-ON88256AFF.0008CE10@boulder.ibm.com>
Reply-To: cbm@rose.hp.com
Sender: owner-ips@ece.cmu.edu

John Hufferd wrote:
....

> In-order command arrival on a connection (as the normal case) seems to
> provide quick detection of an error in a Command PDU, and lets the
> recovery code quickly get the missing command resent.  What do you think
> will be the effect of not knowing whether the command is delayed in the
> initiator or was dropped because of a Header Digest Error?
> 

On single-connection sessions, yes, there's additional non-determinism
on the target.  But keep in mind that the target can at best
send a NOP to prompt the initiator to retransmit, even if it knows for
sure that a command is missing - it cannot definitively communicate its
knowledge about the missing command to the initiator.

On multi-connection sessions, nothing really changes.
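
The target-side "scoreboard" this thread keeps coming back to can be
sketched roughly as below. This is an illustrative Python sketch only,
not text from the draft: the names `CmdScoreboard`, `receive`, `has_hole`
and `serial_lt` are hypothetical, and it assumes CmdSN is a 32-bit counter
compared with wrap-around (serial number) arithmetic. A True result from
`has_hole()` marks the point where a target knows a CmdSN is missing but
cannot tell whether it was delayed or dropped; at best it could send a NOP
there.

```python
MOD = 2 ** 32  # assumption: CmdSN is a 32-bit counter that wraps


def serial_lt(a, b):
    """True if a precedes b in 32-bit wrap-around (serial number) order."""
    return a != b and ((b - a) % MOD) < 2 ** 31


class CmdScoreboard:
    """Hypothetical target-side reorder buffer keyed on CmdSN."""

    def __init__(self, exp_cmd_sn):
        self.exp_cmd_sn = exp_cmd_sn  # next CmdSN to hand to the SCSI layer
        self.pending = {}             # CmdSN -> command held behind a hole

    def receive(self, cmd_sn, command):
        """Accept one command; return the commands now deliverable in order."""
        if serial_lt(cmd_sn, self.exp_cmd_sn) or cmd_sn in self.pending:
            return []  # already delivered, or a duplicate retransmission
        self.pending[cmd_sn] = command
        ready = []
        # Drain every command that is now contiguous with ExpCmdSN.
        while self.exp_cmd_sn in self.pending:
            ready.append(self.pending.pop(self.exp_cmd_sn))
            self.exp_cmd_sn = (self.exp_cmd_sn + 1) % MOD
        return ready

    def has_hole(self):
        """True while some command is still held behind a missing CmdSN."""
        return bool(self.pending)


sb = CmdScoreboard(exp_cmd_sn=10)
sb.receive(10, "read A")    # delivered at once
sb.receive(12, "write C")   # held: CmdSN 11 is missing
sb.receive(11, "read B")    # fills the hole, releases 11 and 12
```

Note that the same code path serves a retransmission after a digest error
and an initiator that deliberately sends out of order; the target cannot
distinguish the two from the CmdSN alone.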
-- 
Mallikarjun 


Mallikarjun Chadalapaka
Networked Storage Architecture
Network Storage Solutions Organization
MS 5668	Hewlett-Packard, Roseville.
cbm@rose.hp.com


> .
> .
> .
> John L. Hufferd
> Senior Technical Staff Member (STSM)
> IBM/SSG San Jose Ca
> Main Office (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> Home Office (408) 997-6136, Cell: (408) 499-9702
> Internet address: hufferd@us.ibm.com
> 
> "Mallikarjun C." <cbm@rose.hp.com> on 11/08/2001 05:28:21 PM
> 
> To:   John Hufferd/San Jose/IBM@IBMUS
> cc:   <ips@ece.cmu.edu>
> Subject:  Re: iSCSI: Out of order commands
> 
> John,
> 
> Sorry, this note got a little longer than I would've liked, but....
> 
> I believe there are cases where OOO CmdSN handling is a
> legitimate requirement on targets due to exception events -
>     a) retransmitting a CmdSN on a command acknowledgement
>       timeout (within-connection recovery class).  This manifests as
>       an OOO CmdSN on the connection to a target if it didn't see
>       the original copy due to a digest error.
>     b) retransmitting the last few "lost" commands due to a connection
>       failure on a new connection. If this new connection had already
>       carried a CmdSN greater than these retransmitted commands
>       (prior to connection failure), this again manifests as OOO CmdSN
>       on the new connection to the target.
> 
> OTOH, I believe sending OOO CmdSNs on a connection as a
> regular practice is counterproductive, since the target must continuously
> re-order the initiator "optimization" leading to a zero-sum game.  I
> would argue that the need to dispatch CmdSNs OOO due to immediate
> data DMA (brought up by Rod) can be addressed by simple NIC
> changes to prefetch data for the next command (or more simply use
> unsolicited separate data PDUs, if negotiated).  [ You got to deal
> with the case of all commands being writes anyway! ]
> 
> If we allow OOO CmdSNs on a connection (I'd advocate discouraging
> it as a regular practice), I don't believe any of the stuff in error
> recovery breaks (nor does it affect the current reliance on ExpCmdSN).
> Julian perhaps can comment.
>     - All the in-order assumptions are for DataSNs/R2TSNs/StatSNs, not
>        for CmdSNs.
>     - Any multi-connection session by definition must deal with OOO CmdSNs.
>     - I believe that the current abort task scheme for immediate commands
>       detailed in section 9.3 caters to OOO CmdSNs on a connection
>       as well (we must be dealing with an immediate Abort arriving before
>       the command today, since the command could have been hit with
>       a digest error).
> 
> To summarize, here is what I suggested to Julian in a private email -
> 
> a) I suggest using a SHOULD for in-order dispatch of
>    commands on a connection - for an initiator.
>
> b) I suggest using a SHALL for handling out-of-order commands
>    on a connection - for the target (as Barry pointed out).
> 
> Hope that was useful.
> --
> Mallikarjun
> 
> Mallikarjun Chadalapaka
> Networked Storage Architecture
> Network Storage Solutions Organization
> Hewlett-Packard MS 5668
> Roseville CA 95747
> 
> ----- Original Message -----
> From: "John Hufferd" <hufferd@us.ibm.com>
> To: <cbm@rose.hp.com>
> Cc: <ips@ece.cmu.edu>
> Sent: Thursday, November 08, 2001 1:40 PM
> Subject: Re: iSCSI: Out of order commands
> 
> >
> > Mallikarjun,
> > Could you comment on the concept of OOO on the ErrorRecoveryLevel>0?  I
> > had thought that "in order delivery" was part of the detection of missing
> > PDUs and needed for timely Recovery.  I was wondering if this changes the
> > way we would use the ExpCmdSN, etc.
> >
> > I think your opinions on this part of the OOO discussion would be
> > valuable.  For example, how would you contrast the differences in
> > detecting a problem and recovering from that problem etc., today vs. the
> > OOO approach (if any)?
> >
> >
> > .
> > .
> > .
> > John L. Hufferd
> > Senior Technical Staff Member (STSM)
> > IBM/SSG San Jose Ca
> > Main Office (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
> > Home Office (408) 997-6136, Cell: (408) 499-9702
> > Internet address: hufferd@us.ibm.com
> >
> >
> > "Mallikarjun C." <cbm@rose.hp.com>@ece.cmu.edu on 11/07/2001 09:41:05 AM
> >
> > Please respond to cbm@rose.hp.com
> >
> > Sent by:  owner-ips@ece.cmu.edu
> >
> >
> > To:   Santosh Rao <santoshr@cup.hp.com>, ips@ece.cmu.edu
> > cc:
> > Subject:  Re: iSCSI: Out of order commands
> >
> >
> >
> > Santosh,
> >
> > I have only one comment on your responses.
> >
> > > Even a single connection target *MUST* implement a scoreboard. The
> > > reason being that it can see out-of-order arrival of commands due to
> > > commands being dropped on digest errors. In such a case, it must block
> > > further command processing until holes are filled.
> >
> > I made two convenient assumptions if you noticed, :-), one of which
> > is that target forces session recovery on *any* error that it sees
> > (ErrorRecoveryLevel=0) - including a dropped command due to a digest
> > error.  With that assumption, a target can afford not to implement
> > a scoreboard.
> >
> > As I said in a private note, I guess what primarily bothers me about
> > OOO commands on a connection is that it requires the receiver to
> > undo this "optimization" on its end - most notably on a single
> > connection.  TCP experts may comment on how/if they dealt with a
> > similar issue.
> >
> > OTOH, you had some valid comments on exceptions to ordering during
> > connection recovery.  Perhaps we can move on by making Julian's
> > proposed stipulation a SHOULD....
> > --
> > Mallikarjun
> >
> >
> > Mallikarjun Chadalapaka
> > Networked Storage Architecture
> > Network Storage Solutions Organization
> > MS 5668   Hewlett-Packard, Roseville.
> > cbm@rose.hp.com
> >
> >
> > Santosh Rao wrote:
> > >
> > > Mallikarjun,
> > >
> > > Some comments below.
> > >
> > > Regards,
> > > Santosh
> > >
> > > "Mallikarjun C." wrote:
> > > >
> > > > Rod and Julian,
> > > >
> > > > This has been an interesting thread of discussion.  Some
> > > > comments -
> > > >
> > > > 1.My first reaction was - allowing out-of-order command
> > > >   transmission on the same connection deprives targets of
> > > >   an implementation choice.  Targets which support only
> > > >   single-connection sessions and only support session
> > > >   recovery (reasonable assumptions in my mind) can no
> > > >   longer afford *not to* implement a command scoreboard.
> > >
> > > Even a single connection target *MUST* implement a scoreboard. The
> > > reason being that it can see out-of-order arrival of commands due to
> > > commands being dropped on digest errors. In such a case, it must block
> > > further command processing until holes are filled.
> > >
> > > Thus, there is no getting away from implementing a sequencer at the
> > > target. Given this, I think it is unreasonable to restrict initiator
> > > implementation flexibility by imposing a strict ordering requirement
> > > within the connection.
> > >
> > > > 2.Any end-node efficiency that is sought to be achieved
> > > >   by transmitting CmdSNs out-of-order from the initiator
> > > >   would be lost on the other end-node, since the target
> > > >   now must wait for re-ordering the commands.
> > >
> > > It has to handle this situation anyway to deal with holes caused by
> > > digest errors. This scenario occurs even with initiators that issue
> > > commands in order.
> > >
> > > >
> > > > 3.The flipside is that out-of-order transmission saves
> > > > >   link bandwidth (albeit at the expense of end-node efficiency),
> > > >   compared to idling the link waiting for outbound DMA.
> > > >   We have to determine if this is a reasonable trade-off.
> > > >
> > > > 4.I can see Rod's point that prefetching all immediate
> > > >   data can be a burden on the NIC resources.  But, two
> > > >   questions -
> > > >         - could the NIC not use unsolicited separate data
> > > >           PDUs in these cases? [ I realize that InitialR2T
> > > >           has to be "no" to let it happen... ]
> > > >         - could the NIC have a memory architecture that
> > > >           allows data prefetching for the next command (so
> > > >           this is a non-issue from the protocol perspective)?
> > > >           This scheme incurs one DMA delay for every new
> > > >           burst of commands.
> > > >
> > > > 5.Another (perhaps radical at this point) option is to do
> > > >   away with immediate unsolicited data, to stick only with
> > > >   separate unsolicited data.  I would personally be okay
> > > >   with the choice, particularly if this feature (that
> > > >   helps software implementations) starts making hardware
> > > >   design complicated/expensive.
> > > >
> > > > So, to summarize -
> > > >
> > > > option                         immediate         allow
> > > >                                data in spec?     out-of-order?
> > > >
> > > > (A) (5) above                  no                no
> > > > (B) No real reason to do this. no                yes
> > > > (C) (4) above                  yes               no
> > > > (D) pros & cons (1), (2) & (3) yes               yes
> > > >
> > > > > From the arguments I heard so far, I am leaning towards
> > > > option A, and option C in that order.
> > > >
> > > > Comments?
> > > > --
> > > > Mallikarjun
> > > >
> > > > Mallikarjun Chadalapaka
> > > > Networked Storage Architecture
> > > > Network Storage Solutions Organization
> > > > MS 5668 Hewlett-Packard, Roseville.
> > > > cbm@rose.hp.com
> > > >
> > > > Rod Harrison wrote:
> > > > >
> > > > > Julian,
> > > > >
> > > > >         I don't understand what you are proposing here, what do you
> > mean by
> > > > > "multiplexed" DMA?
> > > > >
> > > > >         The problem is that the DMAs take some time, the more there
> > are
> > > > > queued the longer the last DMAs queued take to complete. Some
> > commands
> > > > > require DMAs to complete before they can be sent, i.e. Writes with
> > > > > immediate data, some commands do not, i.e. Reads and writes with no
> > > > > immediate data. The iSCSI HBA wants to be able to send commands as
> > > > > soon a possible, which for a read after a write can be before the
> > > > > write's DMA has completed. Maintaining an ordered queue for
> commands
> > > > > to be sent on the HBA is expensive and redundant since the target
> > > > > already knows how to queue commands before committing them to its
> > SCSI
> > > > > layer.
> > > > >
> > > > >         The iSCSI HBA and its host driver are not at liberty to
> > change the
> > > > > order of commands from the OS, but the DMAs those commands need are
> > > > > unlikely to complete in the same order, and as I mentioned some
> > > > > commands need no DMA. If the HBA can't send commands out of CmdSN
> > > > > order it has to maintain an ordered queue of commands waiting to be
> > > > > sent, and potentially buffer a lot of data. For an HBA this makes
> > > > > immediate data almost impossible to support.
> > > > >
> > > > >         I don't see the problem with allowing out of order commands
> > given
> > > > > that the target already has to deal with very similar problems. I
> > > > > think we are getting in to the area of implementation choices here,
> > > > > which is inappropriate for a specification.
> > > > >
> > > > >         - Rod
> > > > >
References:
- Re: iSCSI: Out of order commands
  - From: "John Hufferd" <hufferd@us.ibm.com>
Last updated: Fri Nov 09 16:17:36 2001