RE: iSCSI: Flow Control

Just to clarify on point [7], I was not referring to TCP flow control,
but iSCSI command flow control. I thought I saw someone proposing a
slow start sort of command flow control mechanism. I am a proponent of
using TCP as is without any changes (at least in the first phase).

Somesh

> -----Original Message-----
> From: Douglas Otis [mailto:dotis@sanlight.net]
> Sent: Monday, October 09, 2000 11:14 AM
> To: GUPTA,SOMESH (HP-Cupertino,ex1); IPS@ece.cmu.edu
> Subject: RE: iSCSI: Flow Control
>
> Somesh,
>
> > Hi all,
> >
> > Assuming that we have consensus, especially on [1] below (minimum
> > connections is 1), I think we should try and resolve the flow
> > control issue.
> >
> > It seems to me that there is sufficient consensus that command
> > flow control is needed -
> >
> > [1] To enable fastest possible flow of commands given the
> > capabilities of the target & initiator, and accommodating
> > increased latencies of IP networks
>
> The credit scheme as I have recommended would be carried within each
> frame and not just within the response PDU, to reduce latency of control.
>
> > [2] To significantly minimize the queue full condition. And to
> > provide a recovery mechanism at the iSCSI level when command
> > overflow happens at the target.
>
> The initiator should regulate the number of outstanding commands. This
> regulation will not impact performance at the device level.
>
> > [3] Some of the debate seems to be around whether the credit mechanism
> > should be static or dynamic.
>
> The credit scheme that I have recommended would be dynamic.
>
> > I believe that static is a subset of
> > dynamic (where you never change the value being advertised). I don't
> > disagree with Charles when he says that it will take experimentation
> > to determine how to best adjust the credit dynamically.
> > However,
> > it is important to provide for it in the protocol so that when a
> > vendor does figure out how best to adjust the credit, they have a
> > protocol mechanism to do so. Even though it is an implementation
> > that provides full rate performance, it is the protocol that
> > enables it (take the TCP window scaling option, for example).
> >
> > [4] Another question that comes up is - Should the credit be per
> > connection or per session (multiple connections)?
>
> As the transport's primary function is to provide aggregation down to
> the medium, the credit would be neither on the connection nor at the
> end point, as it is now. It should be at the medium as recommended.
>
> > The current draft does provide for a session wide "flow control"
> > through MaxCmdRn. I believe that it is better to have flow
> > control on a per connection basis. This enables each connection
> > (which might be different NICs) to operate independently of
> > each other. Having a session wide flow control would cause
> > sync points in both the initiator and the target.
> >
> > Also a smaller field could be used if it is just to indicate
> > a credit window.
>
> The credit window should not be carried per connection as you suggest.
> The medium is what needs to be controlled.
>
> > [5] The credit should be a "pretty good effort" and not a "guarantee".
> >
> > This allows smart targets to overcommit as the number of initiators
> > logged in increases (while reducing the credit available to the
> > initiators) and increase the credit and reduce overcommitment as
> > the number of initiators logged in decreases.
> >
> > Some mechanism is required to recover from the infrequent case where
> > command buffers get exhausted and have to be thrown away.
>
> As the credit scheme that I recommended provides the highest resolution
> of control as well as implements a reduction acknowledgement, there
> should be little reason to toss commands or frames.
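
The command credit window debated in points [3] to [5] can be sketched from the initiator's side as follows. This is only an illustration, not the draft's definition: it assumes the target advertises an inclusive range of acceptable command numbers through `ExpCmdRn` and `MaxCmdRn` (the two field names mentioned in the thread), it ignores sequence-number wraparound, and the class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CommandWindow:
    """Initiator-side view of the command credit advertised by a target.

    Assumption: the target accepts command numbers in the inclusive
    range [exp_cmd_rn, max_cmd_rn]. Wraparound is ignored for brevity.
    """
    exp_cmd_rn: int   # next command number the target expects
    max_cmd_rn: int   # highest command number the target will accept
    next_cmd_rn: int  # next number the initiator will assign

    def credit(self) -> int:
        """Number of commands that may still be sent right now."""
        return max(0, self.max_cmd_rn - self.next_cmd_rn + 1)

    def try_send(self) -> Optional[int]:
        """Return the number assigned to a new command, or None if no credit."""
        if self.credit() == 0:
            return None
        rn = self.next_cmd_rn
        self.next_cmd_rn += 1
        return rn

    def update(self, exp_cmd_rn: int, max_cmd_rn: int) -> None:
        """Apply the values carried by a PDU from the target.

        A static scheme is the special case in which the window size
        (max_cmd_rn - exp_cmd_rn + 1) never changes.
        """
        self.exp_cmd_rn = exp_cmd_rn
        self.max_cmd_rn = max_cmd_rn


window = CommandWindow(exp_cmd_rn=1, max_cmd_rn=3, next_cmd_rn=1)
sent = [window.try_send() for _ in range(4)]  # fourth attempt has no credit
window.update(exp_cmd_rn=3, max_cmd_rn=4)     # target opens one more slot
```

A target that overcommits, as in point [5], could shrink the window in `update`; `credit()` then returns 0 until the window moves past the commands already sent.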
>
> > [6] I would recommend that iSCSI provide a way to recover from
> > command overflow and also maintain ordering.
> >
> > The current proposal does not have a drop notification. It has
> > an ack mechanism (ExpCmdRn). I think for the purpose of drop
> > notification, it is better to be able to indicate the range of
> > commands dropped. TCP acks do tell me which commands
> > reached the target, and command responses tell me which were processed.
> >
> > When a target suffers from command exhaustion, it could behave
> > in 2 different ways - one is to drop all the commands it receives
> > till it detects a retransmission. In this case it would send a drop
> > notification of all commands it receives till it starts receiving
> > the command from where the drop started.
>
> If the initiator restricts commands, then there would never be a drop
> requirement. In addition, such a limit on outstanding commands does not
> represent a practical constraint on performance.
>
> > The other would be to store all the commands it is able to provide
> > buffers for and provide NAKs for only those that it has dropped.
> > This would be more efficient.
> >
> > In this case, we should also agree on what the semantics of the
> > processing of the out of order commands are. Should they be
> > processed only when the gaps are filled? Or can they be processed
> > in any order?
>
> As TCP does not provide for out of sequence processing, there is little
> concern within this transport. Only when substantial buffers are
> remaining would out of sequence processing become useful. As these
> buffers should be at the device, and as such handling is already defined
> at the device, no further definitions are required.
>
> > [7] There was some discussion of whether we should propose a slow
> > start algorithm or a fast start algorithm.
> >
> > I think we should use a fast start algorithm at this level.
> > At TCP
> > level, the slow start algorithm is important because the two
> > ends are unaware of the state of the network and have to probe it.
> > At the iSCSI level, the target should be reasonably knowledgeable
> > about its own state and be able to provide a credit or
> > reduce/increase it per login as the conditions change (hopefully
> > with some hysteresis built in).
>
> This is not TCP. Why use TCP if you wish to modify TCP? Resist
> re-engineering TCP. On a LAN, this is not a problem and on a WAN, this
> is a required feature of TCP.
>
> > [8] On flow control of immediate data, should we first work out
> > the command flow control and then turn our efforts to the
> > data flow control?
> >
> > Once we can agree on some of the basic issues, then it should be
> > relatively easy to work out the credit indication/numbering
> > details etc.
>
> To adapt to different flow control schemes, the encapsulation should be
> a separate document from flow control and have flow control either as a
> separate control PDU or as a prefix defined within the flow-control
> draft. This would remove the burden of having one person define
> everything and allow the control mechanism to change without damaging
> encapsulation. I would add that service management should also have the
> same split in documents.
>
> Doug
>
> > Somesh
> >
> > > -----Original Message-----
> > > From: Black_David@emc.com [mailto:Black_David@emc.com]
> > > Sent: Wednesday, October 04, 2000 5:13 PM
> > > To: ips@ece.cmu.edu
> > > Subject: iSCSI sessions: Step 2
> > >
> > > With my WG co-chair hat on, it's time to call
> > > consensus on some of this ...
> > >
> > > Late last week, I sent the "Let's try again" message
> > > on iSCSI sessions, and since then I've only seen
> > > one thread of comments to it from a combination of
> > > Matt Wakeley and Doug Otis.
> > > The important content
> > > of that thread is Matt renewing his position that
> > > more than one connection ought to be REQUIRED. Lest
> > > this seem like annoyance, Matt deserves credit for
> > > being patient with the WG's indirect progress towards
> > > consensus that made it necessary for him to renew his
> > > objection on multiple occasions. As I read Matt's
> > > email, it looks like a good flow control solution
> > > for the single TCP connection iSCSI session case
> > > might satisfy him, but the flow control discussion
> > > is still ongoing.
> > >
> > > In any case, I am stating the following two items
> > > as WG rough consensus, over Matt's renewed objection
> > > in the first case:
> > >
> > > [1] Multiple TCP connections per iSCSI session
> > >     remain OPTIONAL.
> > > [2] Multiple TCP connections per iSCSI session
> > >     will be specified as part of the base
> > >     iSCSI protocol.
> > >
> > > Given that it's two months after the Pittsburgh meeting
> > > I hope the rough consensus will hold on these items;
> > > anyone other than Matt should object to me directly.
> > > If necessary, I'll (reluctantly) reopen these issues
> > > one more time (yes, this is a hint).
> > >
> > > Moving on to the topic of models for multiple connection
> > > sessions, let me start by trying to winnow the approaches
> > > to Asymmetric sessions before taking up Asymmetric vs.
> > > Symmetric again. Four approaches to Asymmetric sessions
> > > have been discussed. I have not seen anyone other than
> > > Pierre Labat support his Balanced model in which a single
> > > stream of control moves from TCP connection to TCP connection
> > > within a session. Therefore I believe it is the WG
> > > rough consensus that:
> > >
> > > [3] The Balanced Asymmetric model in which a single
> > >     control stream moves from TCP connection to TCP
> > >     connection in an iSCSI session will not be pursued.
> > >
> > > Similarly, I saw no objections to the note at the end of
> > > Julian's email, indicating that the Collapsed Asymmetric
> > > model in which data is allowed on the command connection
> > > even when there are multiple TCP connections in an iSCSI
> > > session is technically inferior to both the Pure Asymmetric
> > > and Symmetric models. Therefore I believe it is the WG
> > > rough consensus that:
> > >
> > > [4] The Collapsed Asymmetric model in which data is allowed
> > >     on the command connection in multiple connection
> > >     iSCSI sessions will not be pursued.
> > >
> > > The Pure Asymmetric model was originally described as
> > > requiring two TCP connections per session. Kalman Meth
> > > proposed a modification to it that allowed it to use a
> > > single connection for both command and data. Between
> > > Kalman being the originator of the Pure Asymmetric model,
> > > lack of objection to his proposal, and rough consensus [2]
> > > above, I believe it to be the WG rough consensus that:
> > >
> > > [5] The Pure Asymmetric model will only be considered
> > >     in the modified form that allows an iSCSI session
> > >     to contain a single TCP connection on which both
> > >     command and data flow.
> > >
> > > If all five of the above consensuses (consensii?) hold,
> > > that would be serious progress. Objections to these
> > > should be sent to the list, except that I would ask
> > > Pierre Labat not to object to [3] in the absence of
> > > other objections to it.
> > >
> > > Now comes the hard part - Symmetric vs. modified
> > > Pure Asymmetric (modified by [5] above). There are
> > > over 1000 email messages in my mailbox for the ips
> > > mailing list for the past two months, and I freely
> > > admit to not having reviewed them in detail. I suggested
> > > in the "Let's try again" email that more weight should
> > > be given to those working on implementations, especially
> > > hardware, and have not seen any objections to that
> > > suggestion.
> > > My impression is that the opinion of such
> > > people has been in favor of the Symmetric model -
> > > Matt Wakeley (Agilent), and Somesh Gupta (HP) come
> > > to mind as examples. I'm not confident that this is
> > > the WG consensus, but it appears to me that the
> > > WG is headed in that direction. Please comment on
> > > this - the absence of comments/objections will be
> > > taken as a sign of agreement.
> > >
> > > There has been no comment on the error recovery issue
> > > since my email. Given this and the prior statements that
> > > TCP solves many of the tape error scenarios that are motivating
> > > FCP error recovery, I think the authors of the next version
> > > of the iSCSI draft are entitled to use their best technical
> > > judgement in determining how much error recovery to specify
> > > across multiple TCP connections in an iSCSI session, and
> > > the WG will review it when the next version of the draft
> > > appears.
> > >
> > > We might be getting close to the end of the session issues.
> > > Carefully considered comments are encouraged, but I'd ask
> > > everyone to consider their comments carefully before sending
> > > them, given our past experiences with this set of issues.
> > >
> > > Thanks,
> > > --David
> > >
> > > ---------------------------------------------------
> > > David L. Black, Senior Technologist
> > > EMC Corporation, 42 South St., Hopkinton, MA 01748
> > > +1 (508) 435-1000 x75140 FAX: +1 (508) 497-8500
> > > black_david@emc.com Mobile: +1 (978) 394-7754
> > > ---------------------------------------------------
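
The two target behaviours described in point [6] of Somesh's message (drop everything after the first overflow until a retransmission arrives, versus keep whatever fits and NAK only the dropped commands) can be compared with a small sketch. Everything here is an assumption made for illustration: the function names, the modelling of each arrival as a `(command_number, buffer_available)` pair, and the encoding of a drop notification as a list of inclusive ranges. None of it comes from the iSCSI draft.

```python
def collapse_ranges(numbers):
    """Collapse command numbers into sorted inclusive (first, last) ranges."""
    ranges = []
    for n in sorted(numbers):
        if ranges and n == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], n)  # extend the current range
        else:
            ranges.append((n, n))            # start a new range
    return ranges


def drop_until_retransmit(arrivals):
    """Behaviour 1: after the first overflow, drop every later command too."""
    accepted, dropped = [], []
    overflowed = False
    for cmd_rn, buffer_available in arrivals:
        if overflowed or not buffer_available:
            overflowed = True
            dropped.append(cmd_rn)
        else:
            accepted.append(cmd_rn)
    return accepted, collapse_ranges(dropped)


def drop_selectively(arrivals):
    """Behaviour 2: keep every command that gets a buffer, NAK the rest."""
    accepted = [rn for rn, ok in arrivals if ok]
    dropped = [rn for rn, ok in arrivals if not ok]
    return accepted, collapse_ranges(dropped)


# Hypothetical arrival pattern: commands 6, 7 and 9 find no free buffer.
arrivals = [(5, True), (6, False), (7, False), (8, True), (9, False)]
print(drop_until_retransmit(arrivals))  # ([5], [(6, 9)])
print(drop_selectively(arrivals))       # ([5, 8], [(6, 7), (9, 9)])
```

With the first behaviour the initiator resends one contiguous range; with the second it resends fewer commands, but the target holds command 8 out of order, which is where the ordering question raised in the message comes in.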