
    RE: iSCSI: Flow Control



    Matt,
    
    > "GUPTA,SOMESH (HP-Cupertino,ex1)" wrote:
    >
    > > Hi all,
    > >
    > > Assuming that we have consensus, especially on [1] below (minimum
    > > connections is 1), I think we should try and resolve the flow
    > > control issue.
    > >
    > > It seems to me that there is sufficient consensus that command
    > > flow control is needed -
    > >
    > > [1]   To enable fastest possible flow of commands given the
    > >       capabilities of the target & initiator, and accommodating
    > >       increased latencies of IP networks
    >
    > Well, the "fastest" possible flow of commands would be to send
    > the commands on
    > a dedicated command channel.  Otherwise, they will always be
    > queued up behind
    > data.
    
    As there is a finite amount of bandwidth on a wire, a dedicated connection
    does not ensure empty transmit queues.  Just the opposite could be true, as
    data could be placed well ahead of any commands within the TCP transport.
    A less-used connection is likely to have even less bandwidth.  If you wish
    to ensure command performance, the needed strategy is to avoid flooding the
    transmit buffer.
    
    > > [2]   To significantly minimize the queue full condition. And to
    > >       provide a recovery mechanism at the iSCSI level when command
    > >       overflow happens at the target.
    >
    > If the "MaxCmdRN" mechanism that is already implemented in the draft is
    > observed, there will be no "dropped commands" because the target
    > has indicated
    > how many command buffers it has available (as long as the target
    > doesn't "lie"
    > and the initiator doesn't ignore the target's values).
    
    As this is an end-to-end control, the aggregate of these controls must
    still ensure the medium buffers do not overfill or become stale.
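
    To make the window being discussed concrete, here is a minimal sketch of
    how an initiator might honor a session-wide command window.  It assumes
    the initiator may send a command only while its next CmdRN does not exceed
    the MaxCmdRN last advertised by the target.  The class and method names
    are hypothetical, and sequence-number wraparound is ignored.

```python
class CommandWindow:
    """Initiator-side view of a session-wide command window (hypothetical helper)."""

    def __init__(self, first_cmd_rn, max_cmd_rn):
        # Next command reference number the initiator will assign.
        self.next_cmd_rn = first_cmd_rn
        # Highest command reference number the target will currently accept.
        self.max_cmd_rn = max_cmd_rn

    def can_send(self):
        # The window is open while the next number is within the advertised limit.
        return self.next_cmd_rn <= self.max_cmd_rn

    def send(self):
        # Assign the next number, or refuse if the target has no buffer for it.
        if not self.can_send():
            raise RuntimeError("command window closed; wait for a new MaxCmdRN")
        rn = self.next_cmd_rn
        self.next_cmd_rn += 1
        return rn

    def update(self, max_cmd_rn):
        # Record the limit carried in the latest response from the target.
        self.max_cmd_rn = max_cmd_rn


window = CommandWindow(first_cmd_rn=1, max_cmd_rn=3)
sent = [window.send() for _ in range(3)]   # uses up the whole window
blocked = not window.can_send()            # no credit left
window.update(max_cmd_rn=5)                # target frees two buffers
reopened = window.can_send()
```

    Note that this only bounds command buffers at the target.  It says nothing
    about how much is queued in the buffers along the path.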
    
    > > [4] Another question that comes up is - Should the credit be per
    > >     connection or per session (multiple connections)?
    > >
    > > The current draft does provide for a session wide "flow control"
    > > through MaxCmdRn.
    >
    > This is simply an artifact of the real purpose of the CmdRN fields... to
    > enable re-ordering of commands at the target across multiple
    > (symmetric) iSCSI
    > TCP connections.
    >
    > > I believe that it is better to have flow control on a per
    > > connection basis.
    
    Allowing a TCP buffer to overfill is not a good control scheme as we
    discussed in a prior message.  If you wish a control scheme, it should be
    explicit.  The level of this control should ensure each medium buffer does
    not overfill or become stale.
    
    > Does this mean you now no longer care about command ordering?
    >
    > > This enables each connection (which might be different NICs) to operate
    > > independently of
    > > each other. Having a session wide flow control would cause
    > > sync points in both the initiator and the target.
    >
    > If you want such independence, why not simply use multiple iSCSI
    > sessions and
    > use the wedge driver as others have stated?
    
    As a wedge driver is not part of the standard, it should not be seen as a
    bromide solution.  To create a standard, there should not be underlying
    reliance on vendor unique solutions.
    
    > > Also a smaller field could be used if it is just to indicate
    > > a credit window.
    > >
    > > [5] The credit should be a "pretty good effort" and not a "guarantee".
    > >
    > > This allows smart targets to overcommit as the number of initiators
    > > logged in increases (while reducing the credit available to the
    > > initiators) and increase the credit and reduce overcommitment as
    > > the number of initiators logged in decreases.
    > >
    > > Some mechanism is required to recover from the infrequent case where
    > > command buffers get exhausted and have to be thrown away.
    > >
    > > [6] I would recommend that iSCSI provide a way to recover from
    > > command overflow and also maintain ordering.
    > >
    > > The current proposal does not have a drop notification. It has
    > > an ack mechanism (ExpCmdRn).
    >
    > And this mechanism tells you what commands got to the target.  If
    > the command
    > didn't get to the target, you would know by the ExpCmdRn.  Remember, TCP
    > always delivers (bytes) in order, so if command x didn't make it,
    > neither did
    > all the commands after x.
    
    Should there be a transport that allows out of sequence delivery, this flow
    control scheme would then be incompatible.  Each command need not be
    executed by the device in sequence nor will the response be returned in
    sequence.
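
    For reference, the cumulative acknowledgement Matt describes can be
    sketched as follows.  It assumes ExpCmdRN is the next command number the
    target expects, so every outstanding command numbered below it has
    arrived and every one at or above it is not yet confirmed.  The function
    name is hypothetical and wraparound is ignored.  The inference is only
    valid when the transport delivers in order.

```python
def unconfirmed_commands(outstanding, exp_cmd_rn):
    """Return outstanding command numbers the target has not yet confirmed.

    outstanding: iterable of CmdRN values the initiator has sent.
    exp_cmd_rn:  next CmdRN the target expects (assumed cumulative).
    """
    # With in-order delivery, anything below exp_cmd_rn reached the target.
    return sorted(rn for rn in outstanding if rn >= exp_cmd_rn)


pending = unconfirmed_commands({5, 6, 7, 8}, exp_cmd_rn=7)
```

    With out-of-sequence delivery, command 8 could have arrived while 7 had
    not, and a single ExpCmdRN value could not express that.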
    
    > > I think for the purpose of drop
    > > notification, it is better to be able to indicate the range of
    > > commands dropped. TCP acks do tell me which commands
    > > reached the target,
    >
    > No, TCP acks tell you nothing, because iSCSI sends the commands to the TCP
    > layer in the TCP byte stream, and TCP does not tell the
    > application layer what
    > bytes have been "acked".
    >
    > > and command responses tell me which were processed.
    >
    > If the "MaxCmdRN" mechanism is observed, there will be no
    > "dropped commands"
    > because the target has indicated how many command buffers it has
    > available.
    
    There will be times when the device drops commands.
    
    > > When a target suffers from command exhaustion, it could behave
    > > in 2 different ways - one is to drop all the commands it receives
    > > till it detects a retransmission. In this case it would send a drop
    > > notification of all commands it receives till it starts receiving
    > > the command from where the drop started.
    > >
    > > The other would be to store all the commands it is able to provide
    > > buffers for and provide NAKs for only those that it has dropped.
    > > This would be more efficient.
    > >
    > > In this case, we should also agree on what the semantics of the
    > > processing of the out of order commands are. Should they be
    > > processed only when the gaps are filled? Or can they be processed
    > > in any order?
    > >
    > > [7] There was some discussion of whether we should propose a slow
    > > start algorithm or a fast start algorithm.
    > >
    > > I think we should use a fast start algorithm at this level. At TCP
    > > level, the slow start algorithm is important because the two
    > > ends are unaware of the state of the network and have to probe it.
    > > At the iSCSI level, the target should be reasonably knowledgeable
    > > about its own state and be able to provide a credit or
    > > reduce/increase it per login as the conditions change (hopefully
    > > with some hysteresis built in).
    > >
    > > [8] On flow control of immediate data, should we first work out
    > > the command flow control and then turn our efforts to the
    > > data flow control?
    >
    > Once again, if the asymmetric model is used, with a minimum of two TCP
    > connections, there is no command flow control problem.  There is
    > no command
    > ordering problem.  There is no data flow control problem.  All
    > commands will
    > flow on one TCP connection.  When the command buffers at the target become
    > full, the target will simply let TCP flow control itself.  If there are no
    > data buffers at the target, the target will again simply let the TCP flow
    > control mechanism kick in.
    
    The TCP buffers are not independent.  Connections will interact.  At
    1 Gbit/s, the amount of data in flight is about 1 MByte per 1,000 miles of
    WAN, and for feedback control, double this number.  Control via buffer
    limits?  I think something a bit more explicit would be required.
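
    The in-flight figure can be checked with a short calculation.  This is a
    rough sketch that assumes a propagation speed of about 200,000 km/s
    (roughly two thirds of the speed of light, typical for fiber) and ignores
    queuing and switching delay.  The function name and the speed constant
    are assumptions for illustration.

```python
KM_PER_MILE = 1.609344


def bytes_in_flight(link_bits_per_s, distance_miles, propagation_km_per_s=200_000.0):
    """Bytes occupying the path one way: bandwidth times one-way delay."""
    distance_km = distance_miles * KM_PER_MILE
    one_way_delay_s = distance_km / propagation_km_per_s
    return link_bits_per_s * one_way_delay_s / 8


one_way = bytes_in_flight(1e9, 1000)   # about one million bytes
round_trip = 2 * one_way               # what a feedback loop must cover
```

    Under these assumptions, 1,000 miles gives a one-way delay of about 8 ms,
    which at 1 Gbit/s is roughly 1 MByte in flight and about 2 MByte over the
    round trip.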
    
    Doug
    
    > -Matt
    >
    > >
    > >
    > > Once we can agree on some of the basic issues, then it should be
    > > relatively easy to work out the credit indication/numbering
    > > details etc.
    > >
    > > Somesh
    >
    >
    
    



Last updated: Tue Sep 04 01:06:44 2001