
    RE: iSCSI: RE: Framing Discussion



    > Now, I have another dumb question. For direct data placement, the
    > discussions have been centered mostly around the need for alignment when
    > parsing PDU's on the receiving iSCSI TOE. One potential issue that has not
    > been discussed is the problem of how to handle re-transmission on the
    > sending iSCSI TOE.
    > From previous discussions, I am assuming that our goal is to
    > avoid having a network BWDP worth of memory on the NIC. The receiver
    > can avoid this memory by recovering PDU alignment in the TCP stream
    > and using the self-describing headers in the wire protocol
    > (either iSCSI offsets or a RDMA shim layer) to put the data
    > directly in the buffer cache. On the sending side, we can DMA
    > directly from iSCSI descriptor CDB's into the TCP pipe using a hardware
    > path. But, unless we keep all of those un-acked TCP segment buffers around
    > in the NIC, it will be difficult to recover the context when we have to
    > re-transmit.
    
    Wayland, I enjoyed reading your questions.  Obviously, you have given a lot
    of thought to implementing iSCSI.  To answer your question on retransmit:
    it is actually quite easy.  You are the sender, so you have total
    control -- I assume you are not using an existing TCP implementation.
    Therefore, from the TCP sequence number of the segment, you as the sender
    should know how to reassemble the data.  No extra buffers are needed.
    
    > Let's suppose that we have an iSCSI TCP connection in which we
    > have multiple outstanding I/O's. Thus, the byte stream has
    > interleaved within it commands and data from different I/O's.
    > When we detect a dropped segment either through normal TCP congestion
    > or via SACK, how do we map the missing byte block to the
    > appropriate context? If we keep the segments around, then we
    > could match the missing segment easily and re-transmit. But that would
    > require the NIC to implement a BWDP's worth of transmit buffer memory.
    >
    > To have the iSCSI TOE re-transmit directly from the buffer cache, it seems
    > that we would need some sort of context that would allow us to map a byte
    > window to a specific, meaningful point somewhere in the middle of a CDB
    > context. Essentially, you need enough context to be able to re-construct
    > the TCP fifo since the memory in this fifo has since been effectively
    > re-allocated. Maybe this isn't too hard, but it sure sounds like
    > a difficult problem for hardware to solve. But, as the software
    > folks around here keep telling me, "it's just gates" ;-)
    
    Yes, multiple I/O's and interleaved data streams require a context manager
    that maps the missing segment back to an entry in its large exchange table
    to determine how to retransmit the dropped segment.  No, Wayland, I would
    not do it in hardware.  It is all in microcode.  The microcode size is
    actually not that big.  By contrast, the exchange table can be a few
    hundred KB.  All you need is a very, very fast microengine with a small
    number of gates, a true RISC.  Please keep asking the "dumb" questions.  I
    am most impressed by your questions.
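
    To make the mapping concrete, here is a minimal sketch of the kind of
    lookup such a context manager might perform.  It is written in Python
    for readability only; it is not the microcode described above, and every
    name in it (Extent, ExchangeTable, record_send, lookup) is hypothetical.
    It assumes the sender logs, for each run of bytes it puts on the wire,
    which I/O the bytes came from and where they sit in host memory, and
    that iSCSI headers can be re-read from a retained copy.  A lost byte
    range is then translated back into (exchange, header/data, buffer
    offset, length) pieces that can be DMAed again from the host.

```python
import bisect
from dataclasses import dataclass

SEQ_MOD = 1 << 32  # TCP sequence numbers are 32 bits and wrap around


@dataclass
class Extent:
    """One contiguous run of transmitted bytes (hypothetical layout)."""
    stream_off: int   # unwrapped byte offset within the TCP stream
    length: int       # number of bytes in this run
    exchange_id: int  # outstanding I/O that owns the bytes
    kind: str         # "header" or "data"
    buf_off: int      # offset into that I/O's header copy or host data buffer


class ExchangeTable:
    """Maps TCP sequence ranges back to host buffers instead of keeping copies."""

    def __init__(self, initial_seq):
        self.initial_seq = initial_seq
        self.una_off = 0    # stream offset of the oldest unacknowledged byte
        self.next_off = 0   # stream offset of the next byte to be sent
        self.extents = []   # unacknowledged extents, ordered by stream_off
        self.starts = []    # parallel list of stream_off values for bisect

    def _unwrap(self, seq):
        # Interpret seq relative to the oldest unacknowledged byte so that
        # 32-bit wraparound does not matter.
        base_seq = (self.initial_seq + self.una_off) % SEQ_MOD
        return self.una_off + (seq - base_seq) % SEQ_MOD

    def record_send(self, exchange_id, kind, buf_off, length):
        # Called as bytes are handed to the TCP pipe; stores metadata only.
        self.extents.append(
            Extent(self.next_off, length, exchange_id, kind, buf_off))
        self.starts.append(self.next_off)
        self.next_off += length

    def ack(self, ack_seq):
        # Cumulative ACK: forget extents that are now fully acknowledged.
        acked = self._unwrap(ack_seq)
        if acked > self.next_off:
            raise ValueError("ACK beyond data sent")
        self.una_off = acked
        n = 0
        while (n < len(self.extents) and
               self.extents[n].stream_off + self.extents[n].length <= acked):
            n += 1
        del self.extents[:n]
        del self.starts[:n]

    def lookup(self, seq, length):
        # Translate a missing byte range into pieces to re-fetch from the host.
        start = self._unwrap(seq)
        end = start + length
        if end > self.next_off:
            raise ValueError("range is not in the unacknowledged window")
        i = bisect.bisect_right(self.starts, start) - 1
        pieces = []
        while start < end:
            e = self.extents[i]
            skip = start - e.stream_off
            n = min(e.length - skip, end - start)
            pieces.append((e.exchange_id, e.kind, e.buf_off + skip, n))
            start += n
            i += 1
        return pieces


# Two interleaved I/Os; the initial sequence number is chosen so that the
# stream wraps past 2**32 partway through the first data run.
table = ExchangeTable(initial_seq=0xFFFFFF00)
table.record_send(1, "header", 0, 48)
table.record_send(1, "data", 0, 1000)
table.record_send(2, "header", 0, 48)
table.record_send(2, "data", 0, 2000)

# A 200-byte segment starting 1000 bytes into the stream was dropped.
lost_seq = (0xFFFFFF00 + 1000) % SEQ_MOD
pieces = table.lookup(lost_seq, 200)
```

    In this sketch the dropped segment straddles the tail of the first I/O's
    data, the whole header of the second I/O, and the start of the second
    I/O's data, so lookup returns three pieces.  The table holds only a few
    words per extent, which fits the point above that the cost is in table
    memory rather than in a bandwidth-delay product's worth of segment
    buffers.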
    
    Y.P. Cheng, Connectom Solutions.
    
    



Last updated: Tue Sep 04 01:06:01 2001