
    Re: iSCSI: Out of order commands



    Santosh,
    
    I have only one comment on your responses.
    
    > Even a single connection target *MUST* implement a scoreboard. The
    > reason being that it can see out-of-order arrival of commands due to
    > commands being dropped on digest errors. In such a case, it must block
    > further command processing until holes are filled.
    
    I made two convenient assumptions if you noticed, :-), one of which
    is that target forces session recovery on *any* error that it sees 
    (ErrorRecoveryLevel=0) - including a dropped command due to a digest 
    error.  With that assumption, a target can afford not to implement 
    a scoreboard.
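
A minimal sketch of the two target designs contrasted above, under stated assumptions. The class names, the `receive` method, and the exception are hypothetical and come from no draft or implementation. CmdSN wraparound, immediate commands, and the command window are ignored for brevity.

```python
class SessionRecoveryRequired(Exception):
    """Hypothetical signal that the target gives up and forces session recovery."""


class CmdSNScoreboard:
    """Target that holds commands arriving ahead of a hole in CmdSN."""

    def __init__(self, exp_cmd_sn):
        self.exp_cmd_sn = exp_cmd_sn  # next CmdSN that may be delivered to SCSI
        self.pending = {}             # CmdSN -> command waiting for a hole to fill

    def receive(self, cmd_sn, command):
        """Accept one command and return the commands now deliverable, in order."""
        if cmd_sn < self.exp_cmd_sn or cmd_sn in self.pending:
            return []  # already seen; this sketch simply ignores duplicates
        self.pending[cmd_sn] = command
        ready = []
        # Drain every command that is contiguous with the expected CmdSN.
        while self.exp_cmd_sn in self.pending:
            ready.append(self.pending.pop(self.exp_cmd_sn))
            self.exp_cmd_sn += 1
        return ready


class SessionRecoveryTarget:
    """Target with no scoreboard: any gap in CmdSN forces session recovery."""

    def __init__(self, exp_cmd_sn):
        self.exp_cmd_sn = exp_cmd_sn

    def receive(self, cmd_sn, command):
        if cmd_sn != self.exp_cmd_sn:
            raise SessionRecoveryRequired(
                f"expected CmdSN {self.exp_cmd_sn}, got {cmd_sn}"
            )
        self.exp_cmd_sn += 1
        return [command]
```

With the scoreboard, a command that arrives early is held until the hole before it is filled. The second target stays simple only as long as commands on its single connection arrive in CmdSN order, which is the implementation choice that out-of-order transmission would take away.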
    
    As I said in a private note, I guess what primarily bothers me about 
    OOO commands on a connection is that it requires the receiver to
    undo this "optimization" on its end - most notably on a single 
    connection.  TCP experts may comment on how/if they dealt with a 
    similar issue. 
    
    OTOH, you had some valid comments on exceptions to ordering during 
    connection recovery.  Perhaps we can move on by making Julian's 
    proposed stipulation a SHOULD....
    -- 
    Mallikarjun 
    
    
    Mallikarjun Chadalapaka
    Networked Storage Architecture
    Network Storage Solutions Organization
    MS 5668	Hewlett-Packard, Roseville.
    cbm@rose.hp.com
    
    
    Santosh Rao wrote:
    > 
    > Mallikarjun,
    > 
    > Some comments below.
    > 
    > Regards,
    > Santosh
    > 
    > "Mallikarjun C." wrote:
    > >
    > > Rod and Julian,
    > >
    > > This has been an interesting thread of discussion.  Some
    > > comments -
    > >
    > > 1.My first reaction was - allowing out-of-order command
    > >   transmission on the same connection deprives targets of
    > >   an implementation choice.  Targets which support only
    > >   single-connection sessions and only support session
    > >   recovery (reasonable assumptions in my mind) can no
    > >   longer afford *not to* implement a command scoreboard.
    > 
    > Even a single connection target *MUST* implement a scoreboard. The
    > reason being that it can see out-of-order arrival of commands due to
    > commands being dropped on digest errors. In such a case, it must block
    > further command processing until holes are filled.
    > 
    > Thus, there is no getting away from implementing a sequencer at the
    > target. Given this, I think it is unreasonable to restrict initiator
    > implementation flexibility by imposing a strict ordering requirement
    > within the connection.
    > 
    > > 2.Any end-node efficiency that is sought to be achieved
    > >   by transmitting CmdSNs out-of-order from the initiator
    > >   would be lost on the other end-node, since the target
    > >   now must wait for re-ordering the commands.
    > 
    > It has to handle this situation anyway to deal with holes caused by
    > digest errors. This scenario occurs even with initiators that issue
    > commands in order.
    > 
    > >
    > > 3.The flipside is that out-of-order transmission saves
    > >   link bandwidth (albeit at the expense of end-node efficiency),
    > >   compared to idling the link waiting for outbound DMA.
    > >   We have to determine if this is a reasonable trade-off.
    > >
    > > 4.I can see Rod's point that prefetching all immediate
    > >   data can be a burden on the NIC resources.  But, two
    > >   questions -
    > >         - could the NIC not use unsolicited separate data
    > >           PDUs in these cases? [ I realize that InitialR2T
    > >           has to be "no" to let it happen... ]
    > >         - could the NIC have a memory architecture that
    > >           allows data prefetching for the next command (so
    > >           this is a non-issue from the protocol perspective)?
    > >           This scheme incurs one DMA delay for every new
    > >           burst of commands.
    > >
    > > 5.Another (perhaps radical at this point) option is to do
    > >   away with immediate unsolicited data, to stick only with
    > >   separate unsolicited data.  I would personally be okay
    > >   with the choice, particularly if this feature (that
    > >   helps software implementations) starts making hardware
    > >   design complicated/expensive.
    > >
    > > So, to summarize -
    > >
    > > option                         immediate         allow
    > >                                data in spec?     out-of-order?
    > >
    > > (A) (5) above                  no                no
    > > (B) No real reason to do this. no                yes
    > > (C) (4) above                  yes               no
    > > (D) pros & cons (1), (2) & (3) yes               yes
    > >
    > > From the arguments I heard so far, I am leaning towards
    > > option A, and option C in that order.
    > >
    > > Comments?
    > > --
    > > Mallikarjun
    > >
    > > Mallikarjun Chadalapaka
    > > Networked Storage Architecture
    > > Network Storage Solutions Organization
    > > MS 5668 Hewlett-Packard, Roseville.
    > > cbm@rose.hp.com
    > >
    > > Rod Harrison wrote:
    > > >
    > > > Julian,
    > > >
    > > >         I don't understand what you are proposing here, what do you mean by
    > > > "multiplexed" DMA?
    > > >
    > > >         The problem is that the DMAs take some time, the more there are
    > > > queued the longer the last DMAs queued take to complete. Some commands
    > > > require DMAs to complete before they can be sent, i.e. Writes with
    > > > immediate data, some commands do not, i.e. Reads and writes with no
    > > > immediate data. The iSCSI HBA wants to be able to send commands as
    > > > soon as possible, which for a read after a write can be before the
    > > > write's DMA has completed. Maintaining an ordered queue for commands
    > > > to be sent on the HBA is expensive and redundant since the target
    > > > already knows how to queue commands before committing them to its SCSI
    > > > layer.
    > > >
    > > >         The iSCSI HBA and its host driver are not at liberty to change the
    > > > order of commands from the OS, but the DMAs those commands need are
    > > > unlikely to complete in the same order, and as I mentioned some
    > > > commands need no DMA. If the HBA can't send commands out of CmdSN
    > > > order it has to maintain an ordered queue of commands waiting to be
    > > > sent, and potentially buffer a lot of data. For an HBA this makes
    > > > immediate data almost impossible to support.
    > > >
    > > >         I don't see the problem with allowing out of order commands given
    > > > that the target already has to deal with very similar problems. I
    > > > think we are getting into the area of implementation choices here,
    > > > which is inappropriate for a specification.
    > > >
    > > >         - Rod
    > > >
    > > > -----Original Message-----
    > > > From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
    > > > Julian Satran
    > > > Sent: Monday, November 05, 2001 10:06 PM
    > > > To: ips@ece.cmu.edu
    > > > Subject: Re: iSCSI: Out of order commands, was current UNH Plugfest
    > > >
    > > > Rod,
    > > >
    > > > I don't see any reason why DMA operations can't be "multiplexed" with
    > > > commands.
    > > > If you have scheduled a long outbound DMA you are doomed regardless of
    > > > the command ordering.
    > > > And if you have scheduled DMA operations piecemeal then you can insert
    > > > your commands in correct order.
    > > >
    > > > Julo
    > > >
    > > > "Rod Harrison" <rod.harrison@windriver.com>
    > > > 05-11-01 20:48
    > > > Please respond to "Rod Harrison"
    > > >
    > > >         To:     Julian Satran/Haifa/IBM@IBMIL, <ips@ece.cmu.edu>
    > > >         cc:
    > > >         Subject:        iSCSI: Out of order commands, was current UNH Plugfest
    > > >
    > > >                  [ Subject changed ]
    > > >
    > > > Julian,
    > > >
    > > >         The ordering difference is introduced between the host side
    > > > driver and the iSCSI HBA. The host side driver must present SCSI
    > > > commands to the HBA in the order they are received from the OS to
    > > > prevent read after write dependency failures. The HBA might reorder
    > > > the commands depending on when DMA completes. The reordering can't be
    > > > done ahead of time in the host driver since it doesn't know how long
    > > > each DMA might take. As long as the HBA assigns CmdSN in the order it
    > > > receives commands the desired host ordering is preserved.
    > > >
    > > >                  - Rod
    > > >
    > > > -----Original Message-----
    > > > From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
    > > > Julian Satran
    > > > Sent: Monday, November 05, 2001 12:35 AM
    > > > To: ips@ece.cmu.edu
    > > > Subject: RE: iSCSI: current UNH Plugfest
    > > >
    > > > Rod,
    > > >
    > > > In all the examples given, the point I find hard to understand is why
    > > > the ordering on the wire is different from the presentation order to
    > > > the initiator.  You can get as many overlaps as you want by presenting
    > > > the commands to the initiator in the desired order.
    > > > What we are considering here is the case in which you want to ship in
    > > > an order different than the one in which you present the commands.
    > > >
    > > > Julo
    > > >
    > > > "Rod Harrison" <rod.harrison@windriver.com>
    > > > Sent by: owner-ips@ece.cmu.edu
    > > > 04-11-01 04:42
    > > > Please respond to "Rod Harrison"
    > > >
    > > >         To:     "Barry Reinhold" <bbrtrebia@mediaone.net>,
    > > >                 "Dave Sheehy" <dbs@acropora.rose.agilent.com>,
    > > >                 "IETF IP SAN Reflector" <ips@ece.cmu.edu>
    > > >         cc:
    > > >         Subject:        RE: iSCSI: current UNH Plugfest
    > > >
    > > > Barry,
    > > >
    > > >         In general I agree but I don't think this is as much of a
    > > > corner case as it at first appears. Targets will have code very
    > > > similar to that needed to handle out of order commands to deal with
    > > > digest errors. Targets also need to queue commands whilst waiting
    > > > for both solicited and unsolicited data to arrive. Queuing out of
    > > > order commands seems little extra work.
    > > >
    > > >         From an initiator's point of view there are efficiency, and
    > > > probably performance, gains to be had from sending commands out of
    > > > order. Bob Russell gave the example of a read being sent whilst write
    > > > data DMA is happening, and a similar situation can arise with DMA for
    > > > writes overtaking that of earlier writes if the initiator has
    > > > multiple DMA engines. In this case the initiator might be forced to
    > > > let the wire go idle if it can't send the data from completed DMAs as
    > > > soon as possible.
    > > >
    > > >         We already have a command queue at the target to enforce
    > > > correct serialisation of commands; doing the same thing at the
    > > > initiator is redundant.
    > > >
    > > >         Finally, I don't believe we should be writing a standard to
    > > > work around poor coding and test coverage, especially at the cost of
    > > > potential efficiency gains.
    > > >
    > > >         I agree with Dave and Santosh that commands being sent out of
    > > > order on a single session should be allowed by the standard.
    > > >
    > > >                  - Rod
    > > >
    > > > -----Original Message-----
    > > > From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
    > > > Barry Reinhold
    > > > Sent: Friday, November 02, 2001 5:24 PM
    > > > To: Dave Sheehy; IETF IP SAN Reflector
    > > > Subject: RE: iSCSI: current UNH Plugfest
    > > >
    > > > Using features such as out of order command delivery on a connection
    > > > tends to be the sort of thing that leads to interoperability problems.
    > > > It is unexpected and probably going to hit poorly tested code paths
    > > > even if the standard is written to allow it.
    > > >
    > > > >-----Original Message-----
    > > > >From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
    > > > >Dave Sheehy
    > > > >Sent: Friday, November 02, 2001 4:19 PM
    > > > >To: IETF IP SAN Reflector
    > > > >Subject: Re: iSCSI: current UNH Plugfest
    > > > >
    > > > >
    > > > >
    > > > >> 3. Can commands be sent out of order on the same connection?
    > > > >>
    > > > >>    The behavior of targets is clearly specified in Section 2.2.2.3
    > > > >>    on page 25 of draft 8, which says:
    > > > >>      "Except for the commands marked for immediate delivery the
    > > > >>      iSCSI target layer MUST deliver the commands for execution in
    > > > >>      the order specified by CmdSN."
    > > > >>
    > > > >>    Section 2.2.2.3 on page 26 of draft 8 also says:
    > > > >>      "- CmdSN - the current command Sequence Number advanced by 1
    > > > >>      on each command shipped except for commands marked for
    > > > >>      immediate delivery."
    > > > >>    but the meaning of the term "shipped" is vague, and does not
    > > > >>    necessarily require that the PDUs arrive on the other end of a
    > > > >>    TCP connection in the same order that the CmdSN values were
    > > > >>    assigned to these PDUs.
    > > > >>
    > > > >>    Some initiators have been designed to send commands out of CmdSN
    > > > >>    order on one connection.  Consider the situation where there is
    > > > >>    only one connection and a high-level dispatcher creates a PDU
    > > > >>    for a SCSI command that involves writing immediate data to the
    > > > >>    target.  This PDU is enqueued to a lower-level layer which has
    > > > >>    to set up, start, and wait for a DMA operation to move the
    > > > >>    immediate data into an onboard buffer before the PDU can be put
    > > > >>    onto the wire.  While this is happening, the dispatcher creates
    > > > >>    another unrelated PDU for a SCSI read command (for example), and
    > > > >>    when this PDU is passed to the lower-level layer it can be sent
    > > > >>    immediately, ahead of the previous write PDU and therefore out
    > > > >>    of order on this connection.
    > > > >>
    > > > >>    The standard clearly allows this to happen if the two PDUs were
    > > > >>    sent on different connections, and seems to imply that this can
    > > > >>    also happen when the two PDUs are sent on the same connection.
    > > > >>
    > > > >>    The suggestion is to put in the standard an explicit statement
    > > > >>    that this is allowed or not allowed, as appropriate.
    > > > >>
    > > > >>    If this is allowed, such a statement would avoid the erroneous
    > > > >>    assumption being made by some target implementers that within a
    > > > >>    single connection, commands will arrive in order.
    > > > >>
    > > > >>    If this is not allowed, such a statement would avoid the
    > > > >>    erroneous assumption being made by some initiator implementers
    > > > >>    that within a single connection, commands can be put on the
    > > > >>    wire out of order.
    > > > >>
    > > > >> +++
    > > > >>
    > > > >> will add an explicit statement saying that this behaviour is
    > > > >> forbidden.
    > > > >> 2.2.2.1 will contain:
    > > > >>
    > > > >> On any given connection, the iSCSI initiator MUST send the
    > > > >> commands in the order specified by CmdSN.
    > > > >>
    > > > >> +++
    > > > >
    > > > >Why do you feel this behavior should be forbidden? Targets already
    > > > >have to order commands across the session. I don't see why it's a
    > > > >problem to extend that to the connection as well. I, for one, believe
    > > > >we should take a liberal stance on this.
    > > > >
    > > > >Dave Sheehy
    > > > >
    > 
    > --
    > ##################################
    > Santosh Rao
    > Software Design Engineer,
    > HP-UX iSCSI Driver Team,
    > Hewlett Packard, Cupertino.
    > email : santoshr@cup.hp.com
    > Phone : 408-447-3751
    > ##################################
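
The initiator-side trade-off argued over in this thread can be sketched as follows. This is an illustrative model only. The `Command` type, the tick-based DMA cost, and the assumption that every DMA starts at tick 0 on its own engine are all hypothetical. CmdSN is assigned in the order the host driver presents commands, as Rod describes, and the two functions compare when each command could reach the wire.

```python
from dataclasses import dataclass


@dataclass
class Command:
    name: str
    dma_ticks: int   # hypothetical ticks to stage immediate data; 0 means no DMA
    cmd_sn: int = 0


def assign_cmd_sn(commands, first_cmd_sn=1):
    """Number commands in the order the host driver presented them."""
    for offset, cmd in enumerate(commands):
        cmd.cmd_sn = first_cmd_sn + offset
    return commands


def send_ticks_out_of_order(commands):
    """Each command is sent as soon as its own DMA completes."""
    return {cmd.cmd_sn: cmd.dma_ticks for cmd in commands}


def send_ticks_strict(commands):
    """A command waits for every earlier CmdSN to be sent first."""
    ticks = {}
    latest = 0
    for cmd in sorted(commands, key=lambda c: c.cmd_sn):
        latest = max(latest, cmd.dma_ticks)
        ticks[cmd.cmd_sn] = latest
    return ticks


def wire_order_out_of_order(commands):
    """Order in which PDUs reach the wire when out-of-order sending is allowed."""
    return sorted(commands, key=lambda c: (c.dma_ticks, c.cmd_sn))
```

For a write with immediate data followed by a read, out-of-order sending puts the read (CmdSN 2) on the wire first while the write (CmdSN 1) waits on its DMA. Strict ordering holds the read back until the write is ready. In both cases the CmdSN values still record the host's intended order, so a target that orders by CmdSN executes the write first.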
    


Last updated: Wed Nov 07 22:17:38 2001