Re: Multiple TCP connections

[This mailing list is acting up so forgive me if this is a repeat]

Randy provides a good summary of the "design team's" reasoning for multiple connections per session. My contention is that this argument is inverted: it assumes you need multiple paths through the fabric to get performance, and since current link layer technology (802.3ad) only provides concurrency based on TCP layer headers, multiple connections per session are needed. Further, given multiple connections per session, you can't have a connection per LUN, since simple math shows there would not be enough connections.

If you work the argument backwards, I see a different result. Assume instead one connection per LUN and look at the number of concurrent connections possible. TCP, with its 16-bit port number, limits us to 64K ports and therefore only 64K active connections per IP interface. Given the high bandwidth of existing drives, especially with the amount of cache appearing in controllers, just 10% of those connections active will saturate any link layer technology for the next few decades. Therefore, to reach that many LUNs, and thus connections, you will need multiple IP interfaces anyway. Even with the existing draft proposal, no sane implementor would throttle 10K+ LUNs behind a single IP interface. I would propose that the requirement be 64K active sessions; given that requirement, having a session per LUN makes sense.

The next issue is performance. I agree that to make maximal use of a fabric you need to exploit concurrency. The question is where the correct place to put that concurrency is.
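A back-of-the-envelope check of the saturation claim above. The 64K figure and the 10% active fraction come from the post; the per-LUN throughput and link speed are my assumptions for illustration, not figures from the original.

```python
# Rough check: does 10% of the 64K port space saturate a link?
# Assumed (not from the post): ~20 MB/s sustained per active LUN,
# and a 1 Gb/s (~125 MB/s) link layer.

MAX_PORTS = 2 ** 16        # 16-bit TCP port space -> 64K connections max
ACTIVE_FRACTION = 0.10     # the post's "just 10% ... active"
PER_LUN_MBPS = 20          # assumed per-LUN throughput, MB/s
LINK_MBPS = 125            # assumed link capacity, MB/s (1 Gb/s)

active = int(MAX_PORTS * ACTIVE_FRACTION)   # ~6553 active connections
aggregate = active * PER_LUN_MBPS           # aggregate demand, MB/s
links_needed = aggregate / LINK_MBPS        # interfaces to carry it

print(active, aggregate, round(links_needed))  # 6553 131060 1048
```

Even with these modest per-LUN numbers, the aggregate demand is three orders of magnitude beyond one link, which is the post's point: multiple IP interfaces are needed regardless of the connection model.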
If we have a session per LUN, the standard semantics of a LUN are in-order request/response with minimal concurrency. That leads to a per-connection performance requirement that is on par with today's link level technology and TCP implementations, and that will likely grow at the same rate. The throughput that is really a concern is the aggregate bandwidth from an initiator to multiple LUNs. With a session per LUN, each TCP connection can be placed on a different link layer channel (802.3ad) using TCP layer header information, so performance will scale with link layer improvements using the existing aggregation mechanism.

Ultimately, initiator-to-LUN bandwidth will be a host memory to storage controller cache memory copy. Interconnect technology now being designed, such as InfiniBand, is aimed at exactly this kind of copy, so as storage devices move towards a memory-to-memory model the interconnects will exist. The issue is no longer how to exploit multi-link concurrency for a single TCP stream, but what the contents of a single TCP stream should be, given that existing link technology already carries multiple TCP streams concurrently. I assert that the LUN is the natural unit of concurrency, and that the performance demand of a single LUN can be met by existing TCP implementations and should scale over time.

The numerical argument claims that a single storage controller may have 160K concurrent connections. That is likely an extreme case with a poorly balanced set of hardware, but I will grant it for the sake of argument. The claim is that maintaining that much TCP state will be too expensive. The proposed cost is ~10MB of memory, which today adds about $1 to the cost of a box containing 10,000 disk drives (~$1M assuming $100 per drive). Not a compelling argument.
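The per-connection link placement above can be sketched as a hash over the TCP 4-tuple. The actual 802.3ad frame-distribution function is implementation-defined, so this CRC-based hash, the four-link group, and the target port are illustrative assumptions only; the sketch just shows why one connection per LUN lets sessions spread across aggregated links.

```python
# Illustrative per-connection link selection for an aggregated group.
# The real 802.3ad distribution algorithm is left to the implementation;
# this hash over the TCP 4-tuple stands in for it.

import zlib

NUM_LINKS = 4  # assumption: four links in the aggregation group

def pick_link(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> int:
    """Stably map a TCP 4-tuple to one member link."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % NUM_LINKS

# One connection per LUN gives each LUN a distinct source port, so
# different LUNs can hash onto different member links and run in parallel.
# (Destination port 3260 is assumed here for the target.)
lun_links = [pick_link("10.0.0.1", 32768 + lun, "10.0.0.2", 3260)
             for lun in range(8)]
print(lun_links)
```

Because the hash is stable per 4-tuple, each session stays on one link (preserving in-order delivery for that LUN) while the set of sessions spreads across all member links.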
Furthermore, if you multiplex multiple LUNs per connection, you still need enough state to mux/demux requests, and that state will be on the same order of magnitude as TCP state. So ultimately the "cost" argument is a wash.

> Conclusion: one (or two) TCP connections per LU is both too many (resulting
> in too much memory devoted to state records) and too few (insufficient
> bandwidth for high-speed IO to controller cache). Decoupling the number of
> TCP connections from the number of LUs is the necessary result.

I don't buy the conclusion. The amount of memory devoted to state records is relatively small, and it is effectively constant regardless of whether the mux/demux is done at the TCP layer or the session layer. Also, the driver for interconnect technology is memory-to-memory copying, so advances in storage technology are not likely to outgrow the link layer.

-David