Synchronization problem and TCP big window problem

To: "sip@cup.hp.com" <sip@cup.hp.com>, ips@ece.cmu.edu
Subject: Synchronization problem and TCP big window problem
From: Pierre Labat <pierre_labat@hp.com>
Date: Fri, 30 Jun 2000 20:36:03 -0700
Organization: Hewlett Packard ATM-SISL
Sender: owner-ips@ece.cmu.edu

Hello,


This mail is about three problems raised in the recent discussions:

a) the loss of synchronization in the TCP byte stream

b) in case of a lost or out-of-order packet, the extra delay
   in command completion on the initiator side. This can
   block the initiator because its command window is closed.

c) the amount of TCP dedicated storage needed to
   support a fully TCP-compliant implementation
   on a fast link with a long round-trip time.

About c)
--------
The problem is that the amount of memory needed is
unbounded. The faster the link and the farther the target is
from the initiator, the more TCP dedicated storage is needed.
Several people have done calculations showing how large this
memory can be, and we don't know where this is heading
as links get faster and faster.

This TCP dedicated memory is needed
to cope with out-of-order or lost datagrams.
To get good performance one wants to use SACK.
If a datagram is lost, the receive side has to store
all of the byte stream arriving through the TCP pipe until
the send side retransmits. This byte stream, even
if acknowledged with SACK, has to be
stored in a temporary TCP dedicated buffer, because
iSCSI can't process it: iSCSI has lost synchronization
because of the missing datagram. The amount of memory
needed depends on the RTT and on the link speed,
so it can be very large and will keep growing
in the future. A large memory buffer is therefore
needed just to handle error cases.
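
As a rough illustration of how this memory scales, the sketch below
computes the bandwidth-delay product, i.e. the number of bytes that can
arrive during one round trip, which is about the minimum time before a
retransmission can fill the hole. The function name and the link/RTT
figures are assumptions chosen for the example, not values from the
discussion.

```python
def reassembly_buffer_bytes(link_bits_per_sec: int, rtt_ms: int) -> int:
    """Bytes that can arrive during one round trip (bandwidth-delay product)."""
    # bits/s * ms / 1000 gives bits per RTT; divide by 8 for bytes.
    return link_bits_per_sec * rtt_ms // 8000


# Illustrative figures only (assumptions, not taken from the message).
for gbps, rtt_ms in [(1, 10), (1, 100), (10, 100)]:
    size = reassembly_buffer_bytes(gbps * 1_000_000_000, rtt_ms)
    print(f"{gbps} Gbit/s, {rtt_ms} ms RTT: about {size / 1e6:.2f} Mbytes")
```

The result grows linearly with both link speed and RTT, which is why no
fixed buffer size stays sufficient as links get faster.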


Proposal to solve these problems
================================

Add a "pad" command to iSCSI. This pad command is only one
byte long: the opcode.

How does it work?
-----------------

At login time, the initiator and the target agree on
two synchronization periods (SPEs), one for each direction.


A synchronization period is the number of bytes that
separates two synchronization points (SPOs). The sender
guarantees that an iSCSI header begins at each SPO,
if necessary by inserting padding before it
with the pad command.
The SPE value is implementation dependent and could be
determined based on the memory capacity of the receiver.
The shorter the SPE, the less memory the receiver needs
to handle lost or out-of-order datagrams.
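
The sender-side rule can be sketched as follows. This is only one
possible reading of the rule, under these assumptions: SPOs sit at
multiples of the SPE counted from the start of the stream, no PDU is
longer than the SPE, and the opcode value PAD_OPCODE and the function
name are made up for the example.

```python
PAD_OPCODE = 0xFF  # hypothetical value; the proposal does not assign one


def frame_with_sync_points(pdus, spe):
    """Concatenate PDUs, padding so that no PDU straddles a multiple of spe.

    Assumes SPOs are the stream offsets that are multiples of spe and that
    every PDU is at most spe bytes long.
    """
    stream = bytearray()
    for pdu in pdus:
        if len(pdu) > spe:
            raise ValueError("PDU longer than the synchronization period")
        room = spe - (len(stream) % spe)  # bytes left before the next SPO
        if len(pdu) > room:
            # One-byte pad commands fill the gap up to the SPO.
            stream += bytes([PAD_OPCODE]) * room
        stream += pdu
    return bytes(stream)


# Tiny example: 8-byte SPE, so offsets 8 and 16 must start a PDU.
out = frame_with_sync_points([b"A" * 6, b"B" * 6, b"C" * 4], spe=8)
print(out)
```

Because a PDU is only written when it fits before the next SPO, every
SPO reached in the stream is the first byte of a PDU.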

On the receiver side, when a hole appears in the TCP data stream
(a lost datagram), the receiver continues to SACK
the incoming data stream and stores it in a TCP dedicated buffer
up to the next SPO. From the SPO onward it stops copying into
the TCP dedicated buffer and can again interpret and process
the data stream. Processing means, for example:
in case of WRITE data, copying the data to disk;
in case of READ data, copying it into the host reception buffer;
in case of command completion, doing the cleanup; and so on.
When it receives the missing datagram, it empties the
TCP dedicated buffer.
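
The receiver-side behaviour can be sketched as follows, assuming again
that SPOs sit at multiples of the SPE counted from the start of the
stream. The function names and the byte offsets in the example are
hypothetical.

```python
def next_sync_point(offset: int, spe: int) -> int:
    """First SPO at or after offset (SPOs assumed at multiples of spe)."""
    return -(-offset // spe) * spe  # ceiling division, then scale


def split_after_hole(hole_end: int, received_end: int, spe: int):
    """Split the bytes received after a hole into two half-open ranges.

    Returns (buffered, processed): bytes before the next SPO must wait in
    the TCP dedicated buffer, bytes from the SPO onward can be interpreted
    immediately because a header is known to start there.
    """
    resume = min(next_sync_point(hole_end, spe), received_end)
    return (hole_end, resume), (resume, received_end)


# Hole ends at byte 7,300,000; data has arrived up to byte 12,000,000.
buffered, processed = split_after_hole(7_300_000, 12_000_000, 5_000_000)
print(buffered, processed)
```

For a single hole, the buffered range is always shorter than one SPE,
which is what lets the receiver size its buffer from the SPE.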


For example, if the receiver can devote up to 5 Mbytes of TCP
dedicated memory per TCP connection, it could choose
an SPE of 5 Mbytes.

On a poor-quality line, if its dedicated memory gets
full (because further holes appeared in the data stream after
the re-synchronization and the first holes have not yet been
filled in by the sender), it drops everything new
it receives until it gets the missing datagrams.

Advantages of this proposal
===========================
1) Reduces the memory needed on the receive side while
   maintaining good performance

2) Caps the memory needed for TCP even with long RTT and
   increasing bandwidth

3) Allows a synchronization check at every SPE

4) Negligible loss of bandwidth (padding)
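
To put a rough number on point 4: if padding is only inserted when the
next PDU does not fit before the SPO, the padding in one SPE is always
shorter than that PDU. The sketch below computes the resulting upper
bound. The 64 Kbyte maximum PDU size is an assumption for the example,
not a value from the discussion.

```python
def worst_case_padding_fraction(max_pdu_bytes: int, spe_bytes: int) -> float:
    """Upper bound on the share of each SPE that can be pad bytes."""
    # The gap before an SPO is padded only when it is smaller than the
    # PDU that failed to fit, so it is at most max_pdu_bytes - 1 bytes.
    return (max_pdu_bytes - 1) / spe_bytes


# Assumed figures: 64 Kbyte maximum PDU, 5 Mbyte SPE.
bound = worst_case_padding_fraction(64 * 1024, 5_000_000)
print(f"padding is at most {bound:.2%} of the stream")
```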



Regards,


Pierre
