Re: Multiple TCP connections

[This mailing list is acting up so forgive me if this is a repeat]

Randy provides a good summary of the "design team's" reasoning for multiple connections per session. My contention is that this argument is inverted: it assumes you need multiple paths through the fabric to get performance, and since current link layer technology (802.3ad) only provides concurrency based on TCP layer headers, multiple connections per session are needed. Further, given multiple connections per session, you can't have a connection per LUN, since simple math shows there would not be enough connections.

If you work the argument backwards, I see a different result. Assume instead one connection per LUN and look at the number of concurrent connections possible. TCP, with its 16-bit port number, limits us to 64K ports and therefore only 64K active connections per IP interface. Given the high bandwidth of existing drives, especially with the amount of cache appearing in controllers, just 10% of those connections active will saturate any link layer technology for the next few decades. Therefore, to reach that many LUNs, and thus connections, you will need multiple IP interfaces anyway. Even with the existing draft proposal, no sane implementor would throttle 10K+ LUNs behind a single IP interface. I would propose that the requirement be 64K active sessions; given that requirement, having a session per LUN makes sense.

The next issue is performance. I agree that to make maximal use of a fabric you need to exploit concurrency. The question is where the correct place to put that concurrency is.
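A back-of-the-envelope check of the saturation claim above. The 64K figure and the 10% active fraction come from the post; the per-LUN throughput and link speed are my assumptions for illustration, not figures from the original.

```python
# Rough check: does 10% of the 64K port space saturate a link?
# Assumed (not from the post): ~20 MB/s sustained per active LUN,
# and a 1 Gb/s (~125 MB/s) link layer.

MAX_PORTS = 2 ** 16        # 16-bit TCP port space -> 64K connections max
ACTIVE_FRACTION = 0.10     # the post's "just 10% ... active"
PER_LUN_MBPS = 20          # assumed per-LUN throughput, MB/s
LINK_MBPS = 125            # assumed link capacity, MB/s (1 Gb/s)

active = int(MAX_PORTS * ACTIVE_FRACTION)   # ~6553 active connections
aggregate = active * PER_LUN_MBPS           # aggregate demand, MB/s
links_needed = aggregate / LINK_MBPS        # interfaces to carry it

print(active, aggregate, round(links_needed))  # 6553 131060 1048
```

Even with these modest per-LUN numbers, the aggregate demand is three orders of magnitude beyond one link, which is the post's point: multiple IP interfaces are needed regardless of the connection model.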
If we have a session per LUN, the standard semantics of a LUN are in-order request/response with minimal concurrency. That leads to a per-connection performance requirement that is on par with today's link level technology and TCP implementations, and that will likely grow at the same rate. The throughput that is really a concern is the aggregate bandwidth from an initiator to multiple LUNs. With a session per LUN, each TCP connection can be placed on a different link layer channel (802.3ad) using TCP layer header information, so performance will scale with link layer improvements using the existing aggregation mechanism.

Ultimately, initiator-to-LUN bandwidth will be a host memory to storage controller cache memory copy. Interconnect technology now being designed, such as InfiniBand, is aimed at exactly this kind of copy, so as storage devices move towards a memory-to-memory model the interconnects will exist. The issue is no longer how to exploit multi-link concurrency for a single TCP stream, but what the contents of a single TCP stream should be, given that existing link technology already carries multiple TCP streams concurrently. I assert that the LUN is the natural unit of concurrency, and that the performance demand of a single LUN can be met by existing TCP implementations and should scale over time.

The numerical argument claims that a single storage controller may have 160K concurrent connections. That is likely an extreme case with a poorly balanced set of hardware, but I will grant it for the sake of argument. The claim is that maintaining that much TCP state will be too expensive. The proposed cost is ~10MB of memory, which today adds about $1 to the cost of a box containing 10,000 disk drives (~$1M assuming $100 per drive). Not a compelling argument.
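The per-connection link placement above can be sketched as a hash over the TCP 4-tuple. The actual 802.3ad frame-distribution function is implementation-defined, so this CRC-based hash, the four-link group, and the target port are illustrative assumptions only; the sketch just shows why one connection per LUN lets sessions spread across aggregated links.

```python
# Illustrative per-connection link selection for an aggregated group.
# The real 802.3ad distribution algorithm is left to the implementation;
# this hash over the TCP 4-tuple stands in for it.

import zlib

NUM_LINKS = 4  # assumption: four links in the aggregation group

def pick_link(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> int:
    """Stably map a TCP 4-tuple to one member link."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % NUM_LINKS

# One connection per LUN gives each LUN a distinct source port, so
# different LUNs can hash onto different member links and run in parallel.
# (Destination port 3260 is assumed here for the target.)
lun_links = [pick_link("10.0.0.1", 32768 + lun, "10.0.0.2", 3260)
             for lun in range(8)]
print(lun_links)
```

Because the hash is stable per 4-tuple, each session stays on one link (preserving in-order delivery for that LUN) while the set of sessions spreads across all member links.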
Furthermore, if you multiplex multiple LUNs per connection, you still need enough state to mux/demux requests, and that state will be on the same order of magnitude as TCP state. So ultimately the "cost" argument is a wash.

> Conclusion: one (or two) TCP connections per LU is both too many (resulting
> in too much memory devoted to state records) and too few (insufficient
> bandwidth for high-speed IO to controller cache). Decoupling the number of
> TCP connections from the number of LUs is the necessary result.

I don't buy the conclusion. The amount of memory devoted to state records is relatively small, and it is effectively constant regardless of whether the mux/demux is done at the TCP layer or the session layer. Also, the driver for interconnect technology is memory-to-memory copying, so advances in storage technology are not likely to outgrow the link layer.

-David