Re: iSCSI: Flow Control

Pierre, Julian, YP,

OK, does anyone know what started all this? (Not sure I want an answer.) Here is what I think I heard. Julian thinks the draft works as is, and that he has enough pacing with normal TCP/IP and MaxCmdRN. YP and Pierre have just gone into lots of detail on how there is no blocking if done right, and they seem happy. Now, the way I translate that is that the draft works well enough for YP and Pierre, or they could not have described what they did. So Julian should be happy, YP should be happy that some folks agree with his approach and that the draft works as is, and Pierre should also be happy because some folks agree with him.

OK, what I think this means is that storage controllers that have enough memory to match the flow of incoming data to the processing rate of the disks behind them should be "All Singing, and All Dancing". Storage controllers that can have more data coming in than they have memory to match the processing rate of the disks behind them will have a bit of a problem, and will have to push back from time to time. They may not be as optimal at longer distances as they are at shorter ones. Anyway, this always seemed obvious to me.

So, what I hear you all saying is that the draft is fine the way it is (at least with regard to pacing/credits etc.). Is that right?

The key person who is perhaps out of the boat is Matt, who thinks that we should have at least two conversations per session, run in an asymmetric manner. I have always thought that his point was valid, at least for smaller-memory targets, especially if there were standard NICs on the sending side. So I, for one, would like to hear from Matt.

Let me ask YP and Pierre one other thing. How do you think your proposed design would operate if the sender was using, as Pierre calls it, "regular networking" and the target was using your proposed design? Is it still "All Singing and All Dancing"?

. . .
John L. Hufferd
Senior Technical Staff Member (STSM)
IBM/SSG San Jose Ca
(408) 256-0403, Tie: 276-0403
Internet address: hufferd@us.ibm.com


Pierre Labat <pierre_labat@hp.com>@ece.cmu.edu on 10/10/2000 04:04:37 PM
Sent by: owner-ips@ece.cmu.edu

To: ips@ece.cmu.edu
cc:
Subject: Re: iSCSI: Flow Control

julian_satran@il.ibm.com wrote:

> Pierre,
>
> You are wrong again. When the target reopens the window - i.e., reads some
> data from the pipe at its end - you get to put your Read command in, but it
> goes after the rest of the window, and the window can be several megabytes.

Julian,

The TCP window is not a buffer on the receive side. On the receive side, in our case (the target), and as long as TCP segments arrive in order, there is no opaque FIFO containing a full window's worth of commands/data waiting to be processed. You can avoid that. What the target does is: receive bytes through the TCP connection, do the TCP work, and form an iSCSI PDU. The most you have to store is a few TCP segments to rebuild the PDU. As soon as the PDU is built, it is processed.

When the target wants to close the TCP window, it updates the window accordingly and CONTINUES to process the incoming PDUs. At that point you assume that the incoming PDUs are put in an opaque FIFO; but instead of that, the target can process them and put the data at the right location in the target cache. Then, when the window is opened again and the read PDU comes, it is processed immediately.
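To make that receive path concrete, here is a minimal C sketch of a target that rebuilds PDUs straight out of the TCP byte stream and hands each one off the moment it is complete, so at most one partial PDU is ever buffered. The fixed 48-byte header with a data-segment length in bytes 5-7, the size cap, and the process_pdu() hook are assumptions for illustration only, not anything the draft or the posts above specify.

    /*
     * Illustrative sketch only: a target receive path with no opaque
     * FIFO.  Bytes arrive from TCP in arbitrary-sized chunks; each PDU
     * is dispatched as soon as it is complete, so the target never
     * holds more than one partial PDU.
     *
     * Assumed (not from the draft): 48-byte header, data-segment
     * length big-endian in header bytes 5..7, process_pdu() hook.
     */
    #include <assert.h>
    #include <stdint.h>
    #include <string.h>

    #define HDR_LEN 48
    #define MAX_PDU (HDR_LEN + 8192)   /* illustrative cap on one PDU */

    struct reasm {
        uint8_t buf[MAX_PDU];   /* holds at most one partial PDU */
        size_t  have;           /* bytes of the current PDU so far */
        void  (*process_pdu)(const uint8_t *pdu, size_t len);
    };

    static size_t pdu_total_len(const uint8_t *hdr)
    {
        size_t dlen = ((size_t)hdr[5] << 16) | ((size_t)hdr[6] << 8) | hdr[7];
        return HDR_LEN + dlen;
    }

    /* Feed whatever TCP just delivered; dispatch every PDU that completes. */
    void reasm_feed(struct reasm *r, const uint8_t *data, size_t len)
    {
        while (len > 0) {
            /* bytes still needed: first the header, then the full PDU */
            size_t want = (r->have < HDR_LEN) ? HDR_LEN : pdu_total_len(r->buf);
            assert(want <= MAX_PDU);  /* a real target checks its negotiated max */

            size_t n = want - r->have;
            if (n > len)
                n = len;
            memcpy(r->buf + r->have, data, n);
            r->have += n;
            data    += n;
            len     -= n;

            if (r->have >= HDR_LEN && r->have == pdu_total_len(r->buf)) {
                r->process_pdu(r->buf, r->have);  /* handled immediately */
                r->have = 0;                      /* buffer reused at once */
            }
        }
    }

The point of the sketch is only that the dispatch happens inside the feed loop: closing the TCP window throttles the sender without an unprocessed backlog piling up behind it.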
In fact, as Y P Cheng described in a previous mail in this thread, the model that can be used for iSCSI traffic is different from the common model we have for regular TCP/IP networking, even though a TCP fully compliant with the RFCs can be used for iSCSI. In regular TCP/IP networking, the application (on the transmit side) fills a FIFO that the adapter empties. In our case, as explained by Y P Cheng, you replace the FIFO with an "exchange table", what I called a flat array. It allows you to avoid head-of-queue blocking at this level.

On the receive side (the target in our case) in regular networking, the incoming data are tossed into a FIFO by TCP. The application empties this FIFO and can block (in which case the FIFO grows), and yes, when the application unblocks, it has a large number of PDUs to process. But in the model described, the application never blocks; hence there is no big opaque receive FIFO on the target. In our case the application is the module that processes the iSCSI PDUs. The application never blocks because it is able to pace down the flow coming from the initiator with the TCP window and the command flow control (MaxCmdRN).

Regards,

Pierre
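For illustration, here is a minimal C sketch of the command-level pacing (MaxCmdRN) referred to above: the initiator stamps each command with a serial number and sends only while its next number is inside the window the target last advertised. The field and function names, and the assumption that every target response carries ExpCmdRN/MaxCmdRN, are choices made for the sketch, not quotes from the draft.

    /*
     * Illustrative sketch only: initiator-side command pacing with a
     * MaxCmdRN-style window.  The target advertises how far ahead the
     * initiator may run, so the target is never forced to buffer more
     * commands than it asked for.  Names are assumptions.
     */
    #include <stdbool.h>
    #include <stdint.h>

    struct cmd_window {
        uint32_t next_cmdrn;  /* CmdRN the initiator will stamp next   */
        uint32_t exp_cmdrn;   /* oldest CmdRN the target still expects */
        uint32_t max_cmdrn;   /* highest CmdRN the target will accept  */
    };

    /* Serial-number compare (a <= b) that survives 32-bit wrap-around. */
    static bool sn_lte(uint32_t a, uint32_t b)
    {
        return (int32_t)(b - a) >= 0;
    }

    /* Window is open iff next_cmdrn <= max_cmdrn (mod 2^32). */
    bool may_send_command(const struct cmd_window *w)
    {
        return sn_lte(w->next_cmdrn, w->max_cmdrn);
    }

    /* Called for every target response carrying ExpCmdRN/MaxCmdRN. */
    void on_target_response(struct cmd_window *w,
                            uint32_t exp_cmdrn, uint32_t max_cmdrn)
    {
        /* only move the window forward, never backward */
        if (sn_lte(w->exp_cmdrn, exp_cmdrn))
            w->exp_cmdrn = exp_cmdrn;
        if (sn_lte(w->max_cmdrn, max_cmdrn))
            w->max_cmdrn = max_cmdrn;
    }

Because the target alone decides how far max_cmdrn runs ahead of exp_cmdrn, it can size the window to the buffering it actually has, which is exactly the "pace down the flow" knob described above.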
> Pierre Labat <pierre_labat@hp.com> on 10/10/2000 19:50:48
>
> Please respond to Pierre Labat <pierre_labat@hp.com>
>
> To: ips@ece.cmu.edu
> cc:
> Subject: Re: iSCSI: Flow Control
>
> Julian_Satran@il.ibm.com wrote:
>
> > Pierre,
> >
> > The only point you are missing is that the TCP window may be closed when
> > you want to send your Read command
>
> Julian,
>
> Yes, but as soon as the target re-opens the window, it receives the read
> first.
>
> > and even if not, it will reach the other end after all the data before it,
> > regardless of how clever your adapter is.
>
> The time needed to reach the other end of the wire (for the read, in our
> case) is the same whether or not data was sent on the wire before it. On
> the target, as soon as the read is sampled from the wire, it can be
> processed.
>
> Regards,
>
> Pierre
>
> > The FIFO you have in mind is certainly not
> > equivalent to the pipe capacity.
> >
> > Julo
> >
> > Pierre Labat <pierre_labat@hp.com> on 10/10/2000 02:58:41
> >
> > Please respond to Pierre Labat <pierre_labat@hp.com>
> >
> > To: ips@ece.cmu.edu
> > cc:
> > Subject: Re: iSCSI: Flow Control
> >
> > Julian_Satran@il.ibm.com wrote:
> >
> > > Pierre,
> > >
> > > It does not matter from where you send the data on the wire.
> > > If you have a long wire and you want to cover the latency, you will
> > > send data as soon as you can, and then commands get stuck behind it.
> >
> > Julian,
> >
> > The command can NOT be stuck because there is "data on the wire".
> > Let me give you an example.
> > Let's talk again about the "pull model" adapter on the initiator.
> > Imagine you have 100 Mbytes of (write) data outstanding
> > because 1000 large write commands have been posted to the adapter.
> > The adapter sends this data as fast as it can. But, very importantly,
> > the data is not tossed into any kind of buffer on the adapter.
> > What the adapter does is: pull some kbytes of data from host memory,
> > encapsulate it, and send it on the wire. Again and again, as fast as
> > it can.
> >
> > Now, imagine that a read is posted to the adapter after the 1000 writes.
> > Here is the point. The interface between the host and the adapter is not
> > a FIFO but a flat array, and the adapter can work in parallel on
> > all the commands. Immediately when the host posts the read
> > (in the flat array), the adapter sees it. The adapter, as soon as it
> > completes transmitting the current data PDU, sends the read command.
> >
> > The read command is not stuck behind the 100 Mbytes of data.
> > The maximum latency for the command is the time to
> > transmit one iSCSI PDU on the wire.
> > That is (size of PDU)/throughput.
> > Then the adapter continues to send the write data of the
> > 100 Mbytes. And as soon as a new command is posted,
> > it will send a command PDU immediately after the current data PDU.
> >
> > Commands are not stuck behind data because there is no FIFO
> > before the wire, and because data "on the wire" doesn't block anything.
> > The wire is always able to deliver its throughput.
> >
> > Regards,
> >
> > Pierre
> >
> > > And nobody is suggesting you should park the data on the NIC card if
> > > you know better.
> > >
> > > Julo
> > >
> > > Pierre Labat <pierre_labat@hp.com> on 09/10/2000 20:41:14
> > >
> > > Please respond to Pierre Labat <pierre_labat@hp.com>
> > >
> > > To: Julian Satran/Haifa/IBM@IBMIL
> > > cc:
> > > Subject: Re: iSCSI: Flow Control
> > >
> > > julian_satran@il.ibm.com wrote:
> > >
> > > > Pierre,
> > > >
> > > > Sorry, I missed a point about a - I thought you were saying that
> > > > unsolicited data are not allowed. On this we are in agreement.
> > > >
> > > > On the rest I can hardly follow. The model you suggest, while valid
> > > > in a closed scheme like a bus or a short serial connection in which
> > > > the target fetches data, is closely matched by the R2T for data,
> > > > with no such match for commands. Keeping track of how many commands
> > > > were shipped for what LU is impractical, as we don't want per-LU
> > > > state at the initiator (for the same reason we rejected the
> > > > connection-per-LU model).
> > > >
> > > > As for D - the point is that when you have a command to send and the
> > > > command window is open, you might have to wait a long time, as the
> > > > TCP window is closed and/or you have a lot of data ahead.
> > >
> > > I think there is a misunderstanding about the model I was talking
> > > about. It's a pull model as implemented in some FC cards today, and it
> > > assumes that TCP/IP is handled on the adapter. It is the "no memory on
> > > adapter" model Somesh talked about.
> > >
> > > When a command comes out of the SCSI layer, it is posted to the
> > > adapter. At this point it is posted not into a queue but into a flat
> > > array of commands. The data is still in host memory.
> > > Let's assume the card can handle 1000 commands in parallel; the array
> > > has 1000 entries.
> > > The adapter is able to process these commands any way it wants,
> > > as long as it respects the protocol (iSCSI in our case). It could
> > > process them all in parallel if needed.
> > > As it is a flat array, no command is blocked by another command
> > > or by data. The adapter can pick (pull) whatever command or data it
> > > wants from host memory and send it on the wire (again, as long as it
> > > respects the protocol).
> > >
> > > Regards,
> > >
> > > Pierre
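To close the loop on the "pull model" in the quoted mails, here is a minimal C sketch of that transmit side: a flat array of command slots instead of a FIFO, one data chunk pulled from host memory per data PDU, and any newly posted command sent ahead of the next data PDU, so a read posted behind 100 Mbytes of writes waits at most one data-PDU transmit time. The slot states, send_pdu(), pull_from_host(), and the sizes are all assumptions for illustration.

    /*
     * Illustrative sketch only: "pull model" transmit loop.  Commands
     * live in a flat array, never a FIFO; data is pulled from host
     * memory one chunk at a time, so nothing queues up on the adapter
     * ahead of a freshly posted command.  All names are assumptions.
     */
    #include <stddef.h>
    #include <stdint.h>

    #define SLOTS 1000   /* commands the adapter handles in parallel     */
    #define CHUNK 8192   /* bytes pulled from host memory per data PDU   */
    #define HDR   48     /* assumed PDU header size                      */

    enum slot_state { FREE, POSTED, SENDING_DATA, DONE };

    struct slot {
        enum slot_state state;
        uint8_t  cmd_pdu[HDR];  /* command PDU built by the host driver   */
        uint64_t host_addr;     /* where the write data sits in host memory */
        size_t   remaining;     /* write bytes not yet sent (0 for reads) */
    };

    /* Stubs so the sketch compiles; a real adapter supplies these. */
    static void send_pdu(const void *pdu, size_t len) { (void)pdu; (void)len; }
    static void pull_from_host(uint64_t addr, void *dst, size_t len)
    { (void)addr; (void)dst; (void)len; }

    void tx_loop(struct slot table[SLOTS])
    {
        uint8_t data_pdu[HDR + CHUNK];

        for (;;) {
            int work = 0;

            /* 1. Newly posted commands go out first: a command PDU waits
                  only for the PDU currently on the wire, never for data
                  queued behind it.                                       */
            for (size_t i = 0; i < SLOTS; i++) {
                if (table[i].state == POSTED) {
                    send_pdu(table[i].cmd_pdu, HDR);
                    table[i].state = table[i].remaining ? SENDING_DATA : DONE;
                    work = 1;
                }
            }

            /* 2. Then pull exactly ONE data chunk from host memory and
                  send it; a real adapter would round-robin the slots.    */
            for (size_t i = 0; i < SLOTS; i++) {
                struct slot *s = &table[i];
                if (s->state == SENDING_DATA) {
                    size_t n = s->remaining < CHUNK ? s->remaining : CHUNK;
                    pull_from_host(s->host_addr, data_pdu + HDR, n);
                    /* building the data-PDU header is elided */
                    send_pdu(data_pdu, HDR + n);
                    s->host_addr += n;
                    s->remaining -= n;
                    if (s->remaining == 0)
                        s->state = DONE;
                    work = 1;
                    break;  /* back to step 1 after one data PDU */
                }
            }

            if (!work)
                break;  /* real hardware would idle-wait for the host here */
        }
    }

The break after one data chunk is the whole argument of the thread in one line: the worst-case command latency is one data-PDU transmit time, (size of PDU)/throughput, regardless of how much write data is outstanding.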