RE: iSCSI: RE: Framing Discussion

> Now, I have another dumb question. For direct data placement, the
> discussions have been centered mostly around the need for alignment when
> parsing PDU's on the receiving iSCSI TOE. One potential issue that has not
> been discussed is the problem of how to handle re-transmission on the
> sending iSCSI TOE.
> From previous discussions, I am assuming that our goal is to
> avoid having a network BWDP worth of memory on the NIC. The receiver
> can avoid this memory by recovering PDU alignment in the TCP stream
> and using the self-describing headers in the wire protocol
> (either iSCSI offsets or a RDMA shim layer) to put the data
> directly in the buffer cache. On the sending side, we can DMA
> directly from iSCSI descriptor CDB's into the TCP pipe using a hardware
> path. But, unless we keep all of those un-acked TCP segment buffers around
> in the NIC, it will be difficult to recover the context when we have to
> re-transmit.

Wayland, I enjoyed reading your questions. Obviously, you have given a lot
of thought to implementing iSCSI. To answer your question on retransmit:
it is actually very easy. You are the sender, so you have total control
(I assume you are not using an existing TCP implementation). Therefore,
from the TCP segment sequence number, you as the sender should know how to
reassemble the data. No extra buffers are needed.

> Let's suppose that we have an iSCSI TCP connection in which we
> have multiple outstanding I/O's. Thus, the byte stream has
> interleaved within it commands and data from different I/O's.
> When we detect a dropped segment either through normal TCP congestion
> or via SACK, how do we map the missing byte block to the
> appropriate context? If we keep the segments around, then we
> could match the missing segment easily and re-transmit. But that would
> require the NIC to implement a BWDP's worth of transmit buffer memory.
>
> To have the iSCSI TOE re-transmit directly from the buffer cache, it seems
> that we would need some sort of context that would allow us to map a byte
> window to a specific, meaningful point somewhere in the middle of a CDB
> context. Essentially, you need enough context to be able to re-construct
> the TCP fifo since the memory in this fifo has since been effectively
> re-allocated. Maybe this isn't too hard, but it sure sounds like
> a difficult problem for hardware to solve. But, as the software
> folks around here keep telling me, "it's just gates" ;-)

Yes, multiple I/Os and interleaved data streams require a context manager
that maps the missing segment back to its entry in a large exchange table
to determine how to retransmit the dropped segment. No, Wayland, I would
not do it in hardware. It is all in microcode. The microcode size is
actually not that big. By contrast, the exchange table can be a few
hundred KB. All you need is a very, very fast microengine with a small
number of gates, a true RISC.

Please keep asking the "dumb" questions. I am impressed by them.

Y.P. Cheng, Connectom Solutions.
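
[Editor's illustration, not part of the original message.] The mapping from a missing byte range back to its I/O context could look roughly like the sketch below. It is a minimal model under stated assumptions, not the microcode the message describes: the names `PduRecord`, `ExchangeTable`, `send_pdu`, `ack` and `rebuild` are hypothetical, stream positions are plain unwrapped integers rather than 32-bit TCP sequence numbers, and PDU headers are assumed small enough to keep (or regenerate) per record. The point it shows is that only a small record per un-acked PDU is retained, while the payload is re-read from host memory on retransmit.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class PduRecord:
    """Bookkeeping for one PDU placed on the TCP stream (hypothetical layout)."""
    start: int       # stream offset of the PDU's first byte
    header: bytes    # PDU header bytes (kept here; payload is not)
    task_tag: int    # identifies the outstanding I/O the payload belongs to
    buf_offset: int  # where the payload starts in that I/O's host buffer
    data_len: int    # payload length in bytes

    @property
    def end(self) -> int:
        return self.start + len(self.header) + self.data_len


class ExchangeTable:
    """Maps un-acked stream byte ranges back to host buffers (assumed design)."""

    def __init__(self, host_buffers):
        self.host_buffers = host_buffers  # task_tag -> bytes-like host memory
        self.records = []                 # PduRecords, ordered by start offset
        self.next_offset = 0

    def send_pdu(self, header, task_tag, buf_offset, data_len):
        # Record the context only; the payload itself is not copied.
        rec = PduRecord(self.next_offset, header, task_tag, buf_offset, data_len)
        self.records.append(rec)
        self.next_offset = rec.end
        return rec

    def ack(self, acked_upto):
        # Forget PDUs whose bytes have all been acknowledged.
        while self.records and self.records[0].end <= acked_upto:
            self.records.pop(0)

    def rebuild(self, offset, length):
        """Regenerate stream bytes [offset, offset + length) for retransmit."""
        end = offset + length
        starts = [r.start for r in self.records]
        # Last record whose start is at or before the requested offset.
        i = max(bisect_right(starts, offset) - 1, 0)
        out = bytearray()
        while offset < end and i < len(self.records):
            rec = self.records[i]
            if offset < rec.start:
                raise ValueError("range precedes retained context")
            buf = self.host_buffers[rec.task_tag]
            payload = bytes(buf[rec.buf_offset:rec.buf_offset + rec.data_len])
            pdu = rec.header + payload
            lo = offset - rec.start
            hi = min(end, rec.end) - rec.start
            out += pdu[lo:hi]
            offset = rec.start + hi
            i += 1
        return bytes(out)
```

A dropped segment that straddles PDUs from two different I/Os is handled by the loop in `rebuild`, which walks consecutive records until the requested window is covered. A real implementation would also need sequence-number wraparound handling and would discard records as ACKs arrive, as `ack` does here in simplified form.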
Last updated: Tue Sep 04 01:06:01 2001