Re: More notes from Haifa
On Fri, 30 Jun 2000, Matt Wakeley wrote:
> > An optimization to iSCSI was discussed. The suggestion
> > was that each TCP segment be the start of a new iSCSI PDU.
> > This would be especially valuable for data transfers, as
> > each segment would have enough information to place the data
> > in memory at the remote end.
>
> You didn't note the results of this discussion. I thought I remember
> something that this could not be enforced upon TCP. Something about it could
> "send" a segment whenever it wants (thus that segment would not have a header)
> and that the push bit wouldn't work either, because tcp could accept a few
> more bytes after the push bit.
Your recollections match mine. Thanks for filling in the gap.
---------------------------
The motivations for the discussion are 1) to describe
a simpler fast-path for iSCSI, and 2) to minimize data buffering needs in
the presence of out-of-order reception.
The bulk of the traffic on an iSCSI TCP connection will be SCSI data.
SCSI data can be delivered to the buffer at each end in pretty much
any order, just as long as it is all delivered. So, as an
optimization, instead of keeping tons of SCSI data in a TCP receive
queue, an iSCSI implementation with an optimized TCP could parse
out-of-order segments
and deliver the SCSI data in them to the right SCSI buffer. This would
decrease the amount of buffering needed for TCP receive. In the
limit, this optimization would decrease memory requirements by up to a
factor of 2.
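
As a rough sketch of the out-of-order placement described above (this is not the iSCSI wire format): assume each segment starts with a hypothetical 8-byte header carrying a buffer offset and a data length, so the receiver can copy the payload straight into the SCSI buffer in whatever order segments arrive. The header layout and the names HEADER, make_segment, and place_segment are invented for illustration.

```python
import struct

# Hypothetical header for illustration only (NOT the iSCSI PDU format):
# 4-byte buffer offset, then 4-byte data length, both big-endian.
HEADER = struct.Struct("!II")


def make_segment(offset, data):
    """Build a segment that begins with its own placement header."""
    return HEADER.pack(offset, len(data)) + data


def place_segment(buffer, segment):
    """Copy a self-describing segment's payload directly into the buffer.

    Returns the number of payload bytes placed.
    """
    offset, length = HEADER.unpack_from(segment, 0)
    payload = segment[HEADER.size:HEADER.size + length]
    if len(payload) != length or offset + length > len(buffer):
        raise ValueError("segment does not fit its header or the buffer")
    # Same-length slice assignment: the buffer never grows or shrinks.
    buffer[offset:offset + length] = payload
    return length
```

Because every segment carries its own placement information, a segment that arrives ahead of its predecessors can be written into the buffer immediately instead of sitting in the TCP receive queue.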
As you point out, implementations can't rely on alignment in a TCP stream
- to be interoperable they would have to implement unaligned parsing
too. However, it would be reasonable to implement aligned parsing
in the fast-path and kick unaligned parsing to the slow path.
TCP stacks don't know the alignment requirements of applications
because most applications fail to communicate message boundaries to the
TCP stacks. Historically, this is because the TCP stack ignores
message boundaries.
However, it is possible to communicate message boundaries through
the sockets interface! With the current TCP sockets interface, the sendmsg
call, coupled with the MSG_EOR (end of record) flag, could be used to
communicate iSCSI PDU boundaries to the TCP stack. The getsockopt
call could be used to get the current path MTU of the connection.
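
A minimal sketch of what that might look like from the application side, assuming a Linux host. Whether a given TCP stack does anything useful with MSG_EOR on a stream socket is platform dependent (some stacks reject it outright), and IP_MTU is a Linux-specific socket option. The function names send_pdu and path_mtu are made up for this example.

```python
import socket

# Assumption: 14 is the Linux value of IP_MTU; it is used only if Python's
# socket module does not export the constant on this platform.
IP_MTU = getattr(socket, "IP_MTU", 14)


def send_pdu(sock, pdu):
    """Send one PDU, flagging the write with MSG_EOR (end of record).

    On a blocking socket a single sendmsg normally queues everything;
    the loop only covers the case of a partial send.
    """
    view = memoryview(pdu)
    while len(view) > 0:
        sent = sock.sendmsg([view], [], socket.MSG_EOR)
        view = view[sent:]


def path_mtu(sock):
    """Return the kernel's current path MTU estimate for a connected socket."""
    return sock.getsockopt(socket.IPPROTO_IP, IP_MTU)
```

An iSCSI sender could compare len(pdu) against path_mtu(sock), less the IP and TCP header sizes, to decide whether a PDU can fit in a single segment.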
To communicate message boundaries over the wire, it
would be expedient to add a bit to the TCP header saying:
this segment is also the start of a new higher-layer PDU. This
could help out-of-order parsing of the TCP stream.
As for the PSH bit, RFC 1122, section 4.2.2.2, states "The PSH bit is not
a record marker..."
There are many constraints that can be envisioned:
1) say nothing about alignment
2) iSCSI PDU headers do not span segments
3) if more than one iSCSI PDU header appears in a segment, one
iSCSI PDU header always appears at the start of the segment
4) zero or one iSCSI PDU header per segment and header always
aligned with start of segment.
- potentially creates lots of small segments
5) new segment always starts new iSCSI PDU
- implies iSCSI MTU <= TCP path MTU
- creates lots of small segments
My preference is #3.
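
To make #3 concrete, here is a small checker, read literally and assuming #3 also includes #2 (headers never span segments). The function name, its arguments, and the 8-byte header size used in the test cases are hypothetical; offsets are byte positions in the TCP stream.

```python
def satisfies_constraint_3(segment_lengths, header_offsets, header_size):
    """Check a stream layout against constraints #2 and #3.

    segment_lengths: payload length of each TCP segment, in stream order.
    header_offsets: stream offsets at which PDU headers begin.
    header_size: size of a PDU header in bytes.
    """
    headers = sorted(header_offsets)
    start = 0
    for length in segment_lengths:
        end = start + length
        inside = [h for h in headers if start <= h < end]
        # Constraint #2: a header must not run past the end of its segment.
        if any(h + header_size > end for h in inside):
            return False
        # Constraint #3: with more than one header in the segment,
        # one of them must sit at the very start of the segment.
        if len(inside) > 1 and inside[0] != start:
            return False
        start = end
    return True
```

Under this reading, a segment holding a single header may carry it at any offset; only segments with several headers are forced to begin with one.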
-Costa