Re: iSCSI: Markers

To: Stuart Cheshire <cheshire@apple.com>
Subject: Re: iSCSI: Markers
From: "John Hufferd" <hufferd@us.ibm.com>
Date: Sat, 12 Jan 2002 03:17:33 -0800
Cc: <ips@ece.cmu.edu>
Content-type: text/plain; charset=us-ascii
Importance: Normal
Sender: owner-ips@ece.cmu.edu

Stuart,
thanks for the contribution.

I found your views interesting.  However, in the last part of your message,
when you contrasted COWS with FIM, you mixed a Framing discussion with a
marker discussion.  Here is what I mean:

Most of your arguments address COWS as part of a Framing approach
(forcing Segmentation alignment with PDUs).  Many of us are already fans
of Framing, and understand that Key+length and COWS have an important
debate ahead, one that needs a lot of real-world ASIC vendor input.

But then you jumped to contrasting COWS with Framing versus FIM.  That has
not been the focus of the discussion.  I cannot think of anyone who is
suggesting FIM in place of any Framing approach.  Framing is the goal, and
an honest debate on the type will be needed and useful.  The debate then
occurs in two areas: what do we do until we have a Framing solution
(Framing being out of the domain of iSCSI, we must wait until one is
accepted before we can point to it), and what kind of Framing solution
should we wish for, so that we can influence the Framing debate.  Hence the
Six Options.

So some of your discussion of COWS is completely valid for Framing, but
is not quite as useful without Framing.  When it comes to not having
Framing (with Segmentation starts, etc.), some will argue that COWS is much
more complex and has much higher overhead.

So again, thank you for your contribution; I just wanted to correct that
slight misalignment at the end of your note.

.
.
.
John L. Hufferd
Senior Technical Staff Member (STSM)
IBM/SSG San Jose Ca
Main Office (408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
Home Office (408) 997-6136, Cell: (408) 499-9702
Internet address: hufferd@us.ibm.com


Stuart Cheshire <cheshire@apple.com> on 01/11/2002 05:39:17 PM

To:    John Hufferd/San Jose/IBM@IBMUS, <ips@ece.cmu.edu>
cc:
Subject:    Re: iSCSI: Markers



I'm new to this list, so I should introduce myself.

My name is Stuart Cheshire; I'm the author of Consistent Overhead Byte
Stuffing (COBS), the framing technique from which COWS derives. I'm not
working on any iSCSI product, but if COBS can contribute to iSCSI, then
I'm happy to offer a little of my time, as much as I can spare, to help
clarify what COWS does and does not do, to help people make an informed
decision whether or not COWS is the right solution for iSCSI.

----

Assumption: A high-performance receiver is harder than a high-performance
sender.

This is because the sender is in control. It knows where the data is
coming from in memory, and where it is going to on the network. The
sending host knows and can control all aspects of the communication: what
order iSCSI messages are delivered onto the wire, how big each one is,
and at what time they are sent. If the sender wants to do some kind of
housekeeping that prevents it from sending packets for a few
milliseconds, then it has the option of doing that without terrible
consequences.

The receiver has a much harder time. It never knows what packet is going
to arrive next, or how big it will be, or where it will be from, or where
it will have to go to in memory. Packet loss/corruption/reordering makes
things even more unpredictable. A receiver doesn't have the luxury of
being able to not receive packets for a few milliseconds if it is busy
with something else.

For this reason, it makes sense to see what the sender can do to make the
receiver's life a little easier. If the receiver could receive each TCP
segment and process it in isolation, determining where to place it in
memory solely from information within that TCP segment, without reference
to data from other TCP segments (which may not have arrived yet), then it
would be easier to make a high-performance receiver.

What can we do to enable independent segment processing and idempotent
direct data placement at the receiver?

My first choice would be to add a couple of extra bits to the TCP header;
a "start of message" bit and an "end of message" bit. The "start of
message" bit indicates that the first byte of TCP data in the segment is
also the first byte of an iSCSI message; the "end of message" bit
indicates that the last byte of TCP data in the segment is also the last
byte of an iSCSI message. When a receiver receives a TCP segment with
both bits set, it knows with certainty that it has one (or more) complete
iSCSI messages in the TCP segment and can immediately decode enough of
the iSCSI message header(s) to determine where in memory to place the
data.

Unfortunately, adding extra bits to the TCP header is not viable. From a
political point of view, trying to change the TCP on-the-wire protocol is
a non-starter. From a practical point of view, there are too many routers
and firewalls and similar devices that will throw away TCP packets with
bits they don't understand.

Given that out-of-band framing using header bits is not possible, the
alternative is in-band framing using only information in the TCP data
stream itself.

If we can design our sender to normally send exactly one iSCSI message
per TCP segment, and we have a way for our receiver to reliably verify
that the received TCP segment contains exactly one iSCSI message, then
the receiver can implement idempotent direct data placement for each TCP
segment as it is received, without reference to state from previous TCP
segments on that connection (which may not have arrived yet).

The problem left to solve is how the receiver can reliably verify that
the received TCP segment contains exactly one iSCSI message. It can do
this by checking to see whether the TCP segment data begins with some
special marker pattern, as long as it knows that this special marker
pattern cannot appear anywhere within the body of valid iSCSI message
data. This necessarily entails processing ("stuffing") the body of the
iSCSI message to eliminate inadvertent occurrences of the special marker
pattern before sending, and then reversing this transformation to restore
the original data after reception.
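
A minimal sketch of the stuffing idea described above, written in Python
for illustration. It is not the COWS specification: the marker value
0xC0DEC0DE, the two-word header layout, and the function names are all
assumptions made for this example. Each occurrence of the marker in the
body is overwritten with the distance (in words) to the next occurrence,
so the marker never appears after the first word of the frame, and the
receiver can walk the chain of distances to restore the original data.

```python
MARKER = 0xC0DEC0DE  # hypothetical 32-bit marker word, chosen for this sketch


def stuff_words(words):
    """Return [MARKER, link] + body, with no MARKER anywhere in the body.

    `link` is 1 + the index of the first stuffed word, or 0 if there is none.
    Each stuffed word holds the distance to the next stuffed word, or 0 if
    it is the last one in the chain.
    """
    # Distances are at most len(words), so they can never equal MARKER.
    assert len(words) < MARKER
    body = list(words)
    hits = [i for i, w in enumerate(body) if w == MARKER]
    for here, nxt in zip(hits, hits[1:] + [None]):
        body[here] = 0 if nxt is None else nxt - here
    first = hits[0] + 1 if hits else 0
    return [MARKER, first] + body


def unstuff_words(frame):
    """Reverse stuff_words(): walk the link chain and restore each MARKER."""
    if len(frame) < 2 or frame[0] != MARKER:
        raise ValueError("frame does not begin with the marker")
    link, body = frame[1], list(frame[2:])
    pos = link - 1
    while link:
        if pos >= len(body):
            raise ValueError("corrupt link chain")
        link = body[pos]       # distance to the next stuffed word (0 = last)
        body[pos] = MARKER     # put the original word back
        pos += link
    return body
```

For example, stuff_words([1, MARKER, 7, MARKER, 9]) produces a frame of
seven words whose only marker is the first word, and unstuff_words()
applied to that frame returns the original five words.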

If the receiver finds that the segment does not begin with the special
marker pattern, then it knows that the sender segmentation has not been
maintained (or it is talking to an old TCP sender that doesn't support
sender segmentation) and it has to fall back to treating the TCP data
stream as a raw unstructured byte stream, with message boundaries
indicated by occurrences of the marker pattern. The important thing
is that the receiver still works correctly, even though the performance
will be lower.
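
The two receive paths above could be sketched roughly as follows. This is
an illustration only: the marker value, the word-list representation, and
the names segment_is_aligned and FallbackReassembler are assumptions, and
the sketch assumes the connection starts on a message boundary. In the
fallback path a message is only known to be complete once the next marker
arrives; a real receiver would more likely use a length carried in the
message header, which is outside what this note describes.

```python
MARKER = 0xC0DEC0DE  # hypothetical 32-bit marker word, chosen for this sketch


def segment_is_aligned(segment_words):
    """Fast-path test: does this TCP segment start on a message boundary?"""
    return bool(segment_words) and segment_words[0] == MARKER


class FallbackReassembler:
    """Slow path: treat the connection as a raw stream and cut at markers."""

    def __init__(self):
        self.buffer = []  # words received but not yet emitted as a frame

    def feed(self, segment_words):
        """Append one segment; return every frame now bounded by two markers."""
        self.buffer.extend(segment_words)
        starts = [i for i, w in enumerate(self.buffer) if w == MARKER]
        frames = [self.buffer[a:b] for a, b in zip(starts, starts[1:])]
        if starts:
            # Keep the last (possibly incomplete) frame for the next call.
            self.buffer = self.buffer[starts[-1]:]
        return frames
```

A receiver built this way would try segment_is_aligned() on each arriving
segment and hand the segment to the reassembler only when the test fails.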

This prefer-sender-segmentation-but-verify approach is important. If the
outgoing data is not processed to guarantee that the special marker
pattern cannot occur, then malicious users might be able to subvert the
protocol by putting contrived patterns in their data. Remember the days
when you could make a user's modem hang up by sending them an email
containing the text "+++ATH"? (Apologies to anyone reading this via modem
who just had their telephone line hang up.)

Another benefit of using in-band framing like this is that we can deploy
it immediately using unmodified TCP stacks. In the future we can use
enhanced sender TCP implementations that take steps to maintain segment
boundaries, and smart receivers will get a performance boost from that,
but it is a compatible upgrade that changes only the implementation, not
the on-the-wire protocol.

Of course, we don't get anything for free. If we want the receiver to be
able to determine with 100% certainty that it has received a complete
iSCSI message in one TCP segment, then the sender will have to do some
work to enable that. This is the cost of COWS. It gives 100% framing
certainty, but at the cost of checking the outgoing data for inadvertent
occurrences of the special marker pattern, and eliminating them. There's
no way for a sender to tell whether the outgoing data contains
inadvertent occurrences of the special marker pattern if the sender is
not willing to look at the data.

On the plus side, the cost of COWS encoding is modest compared to some
alternatives. COWS-encoding adds a little header but otherwise doesn't
change the size of the outgoing data, ever. No matter how many
occurrences of the framing marker pattern are found, the encoded output
length is always exactly the same: the length of the input plus the
length of the fixed-size framing header (typically two words). If the
framing marker pattern is chosen to be something that is rare in normal
(non-malicious) data, then in the common case the encoding step will be a
read-only operation: scan the data, determine that it contains no framing
markers, set the COWS header to indicate that the data contains no
framing markers, and send it.
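
The read-only common case can be sketched as below. The marker value,
the big-endian byte order, the two-word header layout, and the function
names are assumptions for illustration, not taken from any COWS draft.
When the scan finds no marker, the payload buffer is returned untouched
alongside a fixed eight-byte header, so nothing is copied or rewritten.

```python
import struct

MARKER = 0xC0DEC0DE                       # hypothetical marker word
MARKER_BYTES = struct.pack(">I", MARKER)  # big-endian is an assumption


def contains_marker(payload):
    """Scan the payload one aligned 4-byte word at a time (read-only)."""
    return any(payload[i:i + 4] == MARKER_BYTES
               for i in range(0, len(payload), 4))


def frame_common_case(payload):
    """Return (header, payload) if no stuffing is needed, else None.

    The header is always two words: the marker, then a link of 0 meaning
    "the body contains no stuffed words". The payload object is passed
    through unchanged, so the output is always len(payload) + 8 bytes.
    """
    if len(payload) % 4:
        raise ValueError("payload must be a whole number of 4-byte words")
    if contains_marker(payload):
        return None  # caller would fall back to the full stuffing path
    return struct.pack(">II", MARKER, 0), payload
```

Returning the header and payload separately leaves room for a
scatter-gather send, which is one way the sender could avoid copying the
data in the common case.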

In contrast, when using Fixed Interval Markers, if a marker happens to
fall in the middle of the data you are sending, then it creates a 'hole'
in the middle of data that used to be contiguous, and the block of
outgoing data changes size. On the receiving side, the 'hole' created by
the marker has to be repaired in the process of transferring the data
into memory. When using Fixed Interval Markers, when a receiver gets a
TCP segment that contains no marker, it cannot reliably determine what it
is supposed to do with that segment (where to put it in memory) without
referring to the state from the previous TCP segments of that connection. I
don't believe that FIM can provide efficient idempotent direct data
placement for inbound TCP segments, because you can't rely on any given
received segment containing a marker via which the receiver can verify
that the segment contains a complete iSCSI message.
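
To make the 'hole' concrete, here is a small sketch of fixed-interval
marker insertion. It is illustrative only: the function name, the
placeholder marker bytes, and the choice to count the interval in payload
bytes are assumptions, and the real iSCSI marker format is not modelled.

```python
def insert_markers(data, start, interval, marker):
    """Insert `marker` whenever the payload offset reaches a multiple of
    `interval`. `start` is the payload offset of data[0] in the stream."""
    out = bytearray()
    i = 0
    while i < len(data):
        # Bytes remaining before the next marker position.
        to_next = interval - ((start + i) % interval)
        chunk = data[i:i + to_next]
        out += chunk
        i += len(chunk)
        if (start + i) % interval == 0:
            out += marker  # the marker lands between payload bytes
    return bytes(out)
```

With an interval of 4 and a placeholder marker b"<>", the five bytes
b"ABCDE" starting at stream offset 2 become b"AB<>CDE": the block grows
from five bytes to seven and is no longer contiguous. The two bytes b"XY"
at offset 0 come out unchanged, which is the case where a segment carries
no marker at all.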

In summary:

My first choice would be to modify the TCP protocol to support
preservation of upper-level message boundaries.

Given that this is not possible, I think COWS provides a good alternative.

Stuart Cheshire <cheshire@apple.com>
 * Wizard Without Portfolio, Apple Computer
 * Chairman, IETF ZEROCONF
 * www.stuartcheshire.org