Re: [Tsvwg] [SCTP checksum problems]

To: tsvwg@ietf.org
Subject: Re: [Tsvwg] [SCTP checksum problems]
From: Randall Stewart <rrs@cisco.com>
Date: Fri, 27 Apr 2001 02:43:27 -0500
CC: Chip Sharp <chsharp@cisco.com>, ips@ece.cmu.edu, Craig Partridge <craig@aland.bbn.com>, Jonathan Stone <jonathan@dsg.stanford.edu>
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=us-ascii
References: <NEBBJGDMMLHHCIKHGBEJMEODCGAA.dotis@sanlight.net>
Sender: owner-ips@ece.cmu.edu
All...

After a review of the numbers I realize I made a very
large error :-< (egg on face) in my analysis of the gprof numbers....
yikes!!

Below are the updated real numbers...


I will also be re-running these on a Pentium 100, and with Jonathan's
help we will tune up both the CRC32 and the Adler algorithms and add
some more, such as Fletcher.

Please consider these preliminary numbers until Jonathan and I can do
a more detailed analysis.


New numbers inserted below :)

Thanks

R
> 
> > Jim:
> >
> > I am glad you copied me on this, since with being at the bakeoff and
> > the recent email/site problems I have, I do not get IETF mail until
> > next week :0
> >
> > Now, some comments...
> >
> >
> > My major concern is that SCTP's checksum is weaker than TCP's. What the
> > upper layers do to defend against middle boxes etc. they will need to
> > do anyway. I have just run a few numbers using the sctp_test_app in
> > the reference implementation.
> >
> > I did the following:
> >
> > Compiled the normal ref-imp with -O -pg
> >
> > Started two endpoints on the same machine (FreeBSD on an Intel-based
> > Sony VAIO PCG-Z505JS); this of course has my SCTP kernel patches
> > applied ...
> >
> > Now I set up an association between two endpoints and did
> >
> > bulk:100:0:1000000
> >
> > This transfers 1 million 100-byte packets holding ASCII data from one
> > endpoint to the other.
> >
> > I then captured the gprof information for this run.
> >
> > I did the same exact test after changing the checksum to CRC32 and the
> > modified Adler32 (16 bit sums).
> >
> > My results (not meant to show strength in catching errors but instead
> > performance of a software version of the sum) are as follows:
> >
> > Adler 32...
> >
> > Sender side and receiver side average time spent in the checksum
> > routine per call was 121 nanoseconds.
> >
******* corrected number******
5.1 microseconds 

> > Adler 32 Modified
> >
> > Sender side and receiver side average time spent in the checksum
> > routine per call was 90 nanoseconds.

******** corrected number******
3.9 microseconds.


> >
> > CRC32
> >
> > Sender side average time spent in the checksum calculation per packet
> > was 5.3 microseconds.
> >
> > Receiver side average time spent in the checksum calculation per packet
> > was 5.9 microseconds.
> >
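To put the figures above side by side, here is a small arithmetic sketch. It assumes one checksum call per packet on each side, and it assumes the CRC32 figures stand as originally reported, since no correction is shown for them. The names timings_us, total_seconds and relative_to are illustrative only.

```python
# Per-call averages in microseconds, as reported in this message.
# Assumption: the CRC32 figures were not revised (no correction is shown for them).
timings_us = {
    "Adler-32": 5.1,
    "Modified Adler-32": 3.9,
    "CRC32 (sender)": 5.3,
    "CRC32 (receiver)": 5.9,
}

PACKETS = 1_000_000  # from the test command bulk:100:0:1000000


def total_seconds(per_call_us, calls=PACKETS):
    """Total checksum time for the whole run, assuming one call per packet."""
    return per_call_us * calls / 1_000_000


def relative_to(baseline_us, other_us):
    """Percentage difference of other_us against baseline_us."""
    return (other_us - baseline_us) / baseline_us * 100.0


baseline = timings_us["Adler-32"]
for name, us in timings_us.items():
    print(f"{name:20s} {us:4.1f} us/call  {total_seconds(us):4.1f} s total  "
          f"{relative_to(baseline, us):+6.1f}% vs Adler-32")
```

On these preliminary numbers the modified Adler-32 comes out roughly a quarter cheaper per call than plain Adler-32, and CRC32 lands within a few percent to about 16 percent above it.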
> > I believe the differences seen in the CRC32 can be attributed to how
> > lucky the respective application is in finding the index table (the
> > ssh_crc32 found in FreeBSD) in the processor cache. If I run the CRC
> > comparison with just random data in a stand-alone program (very
> > unrealistic) CRC32 outperforms all others, but this is because the table
> > gets completely pre-fetched into cache and is never pulled from main
> > memory... (I started here and then realized the only way to get good
> > information is to do it in a real implementation where a lot of other
> > code would run between CRC calls).
> >
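For readers who have not seen a table-driven CRC, here is a minimal sketch of the technique. It is not the ssh_crc32 routine itself; it assumes the common reflected CRC-32 polynomial and the init/final-XOR convention used by zlib, and the names make_table and crc32_table are illustrative. The point is that every input byte costs one lookup into a 256-entry table, which is why cache residency of that table matters.

```python
import zlib

POLY = 0xEDB88320  # reflected CRC-32 polynomial (assumed; the one zlib uses)


def make_table():
    """Precompute the CRC of every possible byte value."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


# 256 entries; with 32-bit entries in C this is a 1 KB table.
CRC_TABLE = make_table()


def crc32_table(data: bytes) -> int:
    """Table-driven CRC-32: one table lookup per input byte."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = CRC_TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


# Cross-check against the library implementation.
print(crc32_table(b"123456789") == zlib.crc32(b"123456789"))
```

In a tight stand-alone loop the whole table stays in cache; in a real stack, other code running between calls can evict it, so some lookups go out to main memory.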
> > So on that note... I vote STRONGLY for Modified Adler32... i.e. same as
> > regular Adler but make the quantities added be 16 bit sums...
> >
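A minimal sketch of the two variants, for comparison. adler32_bytes is the regular algorithm (RFC 1950). adler32_words is only one possible reading of "16 bit sums": it feeds 16-bit big-endian words into the same two running sums and zero-pads an odd trailing byte. The byte order, the padding and both function names are assumptions, not the reference implementation's code.

```python
import zlib

MOD = 65521  # largest prime below 2**16, the Adler-32 modulus


def adler32_bytes(data: bytes) -> int:
    """Regular Adler-32: add one byte at a time."""
    a, b = 1, 0
    for byte in data:
        a = (a + byte) % MOD
        b = (b + a) % MOD
    return (b << 16) | a


def adler32_words(data: bytes) -> int:
    """Assumed 'modified' variant: add one 16-bit word at a time."""
    if len(data) % 2:
        data += b"\x00"  # assumption: zero-pad an odd-length input
    a, b = 1, 0
    for i in range(0, len(data), 2):
        word = (data[i] << 8) | data[i + 1]  # assumption: big-endian words
        a = (a + word) % MOD
        b = (b + a) % MOD
    return (b << 16) | a


payload = b"Wikipedia"
print(hex(adler32_bytes(payload)), hex(adler32_words(payload)))
```

The word variant runs the loop half as many times per packet, which is one plausible reason for it being cheaper in software.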
> > I think this will take care of the critical problem, i.e. the weakness in
> > the SCTP checksum for small packets... Jonathan/Craig, any comments or
> > questions...
> >
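The small-packet weakness can be illustrated with a bound on the low running sum (the "A" half of Adler-32). A starts at 1 and each step adds at most the largest unit value, so for a short packet it may never reach the modulus at all. The sketch below assumes the 100-byte payload from the test and, for the 2-byte case, the word-at-a-time reading of the modified variant; max_unreduced_a is an illustrative name.

```python
MOD = 65521  # Adler-32 modulus


def max_unreduced_a(payload_len: int, unit_bytes: int) -> int:
    """Upper bound on the low sum A before any modulo reduction."""
    units = -(-payload_len // unit_bytes)      # ceiling division
    unit_max = (1 << (8 * unit_bytes)) - 1     # 255 for bytes, 65535 for words
    return 1 + units * unit_max


for unit_bytes in (1, 2):
    peak = max_unreduced_a(100, unit_bytes)
    verdict = "can wrap" if peak >= MOD else "never wraps"
    print(f"{unit_bytes}-byte units: A <= {peak} ({peak.bit_length()} bits), {verdict}")
```

With byte-wise sums a 100-byte packet keeps A at or below 25501, so the top bit of that 16-bit half is always zero. With 16-bit units the same packet can push A well past the modulus, so the whole field gets used.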
> > And yes... I am working on getting things run on a sparc box, but the
> > only one we have here is a sparc20... so the numbers definitely can NOT
> > be compared to anything else...
> >
> > R
> >
> >
> >
> >
> > "WENDT,JIM (HP-Roseville,ex1)" wrote:
> > >
> > > > I think this "SCTP checksum" thread spanning IPS and TSVWG was for
> > > > discussion around whether or not iSCSI (running over SCTP) could forgo
> > > > data integrity checking and transport-like functionality
> > > > (retransmission, ack, etc.) should SCTP provide a sufficiently strong
> > > > check-code.
> > > > If iSCSI were willing to completely trust SCTP end-to-end across a
> > > > network fabric (including "middleboxes"), then that provides one reason
> > > > for SCTP to adopt a stronger checksum or CRC.
> > > > If iSCSI will still implement its own data integrity check-code above
> > > > SCTP, then SCTP needs to make an independent decision on whether its
> > > > current check-code is sufficiently strong for its target uses.
> > > Currently, iSCSI contains a data integrity check "digest" that can be
> > > negotiated end-to-end to be disabled on a per-connection basis.
> > >
> > > > This discussion begs a few questions:
> > > > - Are there clearly different classes of applications (in regards to
> > > > their end-to-end data integrity strength needs)?
> > > > - How are these application classes' end-to-end data integrity needs
> > > > met in the future?  Is it SCTP, IPSec, application-specific protocol,
> > > > a new protocol?
> > > > - Is there a general need for strong end-to-end data integrity that
> > > > could be provided for in a recommended generic manner?
> > > > - Is iSCSI unique in being an "ultra-low error rate application" and
> > > > should iSCSI then handle its own data integrity?
> > > > - Should SCTP strengthen its checksum to meet the needs of a general
> > > > class of data-critical applications, and/or provide a means for
> > > > negotiating an optional stronger checksum?
> > > > - What is the role of network infrastructure (router/middlebox
> > > > hardware and software) in strengthening end-to-end data integrity?
> > >
> > > > Data integrity for iSCSI over TCP is a separate issue. It is unlikely
> > > > that we will be able to evolve TCP in a timely manner to utilize a
> > > > stronger check-code given TCP's current wide scale deployment (although
> > > > adding a stronger checksum/CRC to TCP would seem to be the best
> > > > solution). So, something else has to be done either above or below TCP
> > > > to provide the required level of iSCSI data integrity. Of course, if
> > > > TCP's data integrity deficiency is impacting other data-critical
> > > > applications, then it seems prudent to at least consider solving the
> > > > problem generically.
> > >
> > > Jim
> > >
> > > > -----Original Message-----
> > > > From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
> > > > Sent: Friday, April 20, 2001 1:02 AM
> > > > To: Chip Sharp
> > > > Cc: vince_cavanna@agilent.com; steph@cs.uchicago.edu; WENDT,JIM
> > > > (HP-Roseville,ex1); ips@ece.cmu.edu; tsvwg@ietf.org;
> > > > craig@aland.bbn.com; Jonathan.Wood@sun.com; xieqb@cig.mot.com;
> > > > jonathan@dsg.stanford.edu; rrs@cisco.com
> > > > Subject: RE: [Tsvwg] [SCTP checksum problems]
> > > >
> > > >
> > > >
> > > >
> > > > Chip,
> > > >
> > > > > CRCs are not meant to protect against malicious middle boxes, but
> > > > > rather against boxes that strip the strong link CRCs and let the
> > > > > end-system rely on the weak TCP checksum.
> > > > >
> > > > > NAT boxes have good reason to recompute TCP checksums, but unless they
> > > > > are malicious no reason to recompute iSCSI CRCs.
> > > > >
> > > > > And against malicious boxes iSCSI has cryptographic digests as options.
> > > > >
> > > > > And I was not aware that we are discussing - in this forum - iSCSI data
> > > > > integrity options.
> > > >
> > > > Julo
> > > >
> > > > Chip Sharp <chsharp@cisco.com> on 19/04/2001 18:53:53
> > > >
> > > > Please respond to Chip Sharp <chsharp@cisco.com>
> > > >
> > > > To:   vince_cavanna@agilent.com
> > > > cc:   steph@cs.uchicago.edu, vince_cavanna@agilent.com,
> > > > jim_wendt@hp.com,
> > > >       Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu, tsvwg@ietf.org,
> > > >       craig@aland.bbn.com, Jonathan.Wood@sun.com, xieqb@cig.mot.com,
> > > >       jonathan@dsg.stanford.edu, rrs@cisco.com
> > > > Subject:  RE: [Tsvwg] [SCTP checksum problems]
> > > >
> > > >
> > > >
> > > >
> > > > > As was pointed out previously, middle box operations (such as NATs)
> > > > > tend to creep up the protocol stack and into applications.
> > > > >
> > > > > Take SIP for example.  It includes IP addresses in its INVITE.  In
> > > > > order to work across a NAT, the IP addresses it exchanges have to be
> > > > > replaced with the NATed address.  One way is for the NAT to reach up
> > > > > into the SIP INVITE and change the address.  This modifies the TCP or
> > > > > UDP checksum.  Now SIP could have included its own integrity check to
> > > > > protect against corrupted or modified TCP checksums, but all that
> > > > > would have happened is that NATs would have changed the SIP checksum
> > > > > in addition to the TCP/UDP checksum.
> > > >
> > > > > Therefore, even if iSCSI included its own integrity check, if a
> > > > > middle box is going to futz with iSCSI packets it will just strip the
> > > > > check, do whatever it does and then recalculate the check.
> > > > >
> > > > > If this is what you want to protect against you will have to go to
> > > > > some type of digital signature.
> > > >
> > > > At 12:22 PM 4/19/2001, vince_cavanna@agilent.com wrote:
> > > > >Stephen,
> > > > >
> > > > > >I have to admit that I do not have much direct experience with middle
> > > > > >boxes, BUT I did have fairly direct and recent experience with a
> > > > > >popular NAT router from a popular vendor that was corrupting data in a
> > > > > >network of Macintoshes.
> > > > > >
> > > > > >Apple's TCP was unaware of any problem as was Apple's Filing Protocol
> > > > > >and most applications. The only applications that detected the
> > > > > >corruption were those that performed an integrity check of their own.
> > > > > >Those applications that assumed a reliable transport (and file system)
> > > > > >were doomed to experiencing the indirect effects of the corruption at
> > > > > >some later time. The corruption only happened when large amounts of
> > > > > >data were transferred quickly.  The router vendor fixed the problem
> > > > > >once; then fixed it again; then fixed it one last time before the data
> > > > > >corruption finally "disappeared". After several weeks of continuous
> > > > > >operation the router appeared to get into a mode where it was once
> > > > > >again corrupting data. Power cycling the router "fixed it". The story
> > > > > >apparently has not yet ended.
> > > > > >
> > > > > >I admit I may have given too much significance to this single incident
> > > > > >that I have personally experienced but on the other hand I don't see
> > > > > >the mechanisms in place to prevent this type of problem in the future
> > > > > >other than the end to end integrity checks.
> > > > > >
> > > > > >Incidentally this incident changed my behavior when transferring data
> > > > > >over a network. I will always use a compression utility; not only for
> > > > > >reducing the data to be transmitted but to ensure the integrity of my
> > > > > >data is protected end to end by the utility's CRC mechanism.
> > > > > >
> > > > > >I believe quite firmly that we DO need a mechanism to allow us to
> > > > > >tolerate poor implementations of middle boxes and cannot simply hope
> > > > > >that eventually such poor implementations will vanish, nor that we
> > > > > >will have the luxury of being able to select only good implementations
> > > > > >for every component of our storage network.
> > > > >
> > > > >Vince
> > > > >
> > > > >|-----Original Message-----
> > > > >|From: Stephen Bailey [mailto:steph@cs.uchicago.edu]
> > > > >|Sent: Wednesday, April 18, 2001 3:09 PM
> > > > >|To: CAVANNA,VICENTE V (A-Roseville,ex1)
> > > > >|Cc: 'WENDT,JIM (HP-Roseville,ex1)'; 'julian_satran@il.ibm.com';
> > > > >|ips@ece.cmu.edu; tsvwg@ietf.org; 'Craig Partridge'; Jonathan Wood;
> > > > >|xieqb@cig.mot.com; Jonathan Stone; Randall Stewart
> > > > >|Subject: Re: [Tsvwg] [SCTP checksum problems]
> > > > >|
> > > > >|
> > > > >|Vince,
> > > > >|
> > > > > >|> I don't think iSCSI can be completely relieved of performing some
> > > > > >|> data integrity checking as long as there exists the possibility of
> > > > > >|> "middle boxes" opening up the transport protocol's packet and thus
> > > > > >|> potentially invalidating any reliability guarantees the transport
> > > > > >|> protocol makes.
> > > > > >|
> > > > > >|Any protection provided against this failure mode will only be
> > > > > >|transient, so we must temper the desire to introduce such a
> > > > > >|requirement with reality.
> > > > > >|
> > > > > >|Middleboxes can just as easily open up to the iSCSI layer and tinker
> > > > > >|with the payload, as they do with other ULPs running on TCP (e.g.
> > > > > >|HTTP) today.  Short of securing the connection, there is ALWAYS a
> > > > > >|possibility of a middlebox terminating and reoriginating an integrity
> > > > > >|check.  In case you think this is a farfetched scenario, I do get the
> > > > > >|impression that there is a high level of interest in `actively
> > > > > >|middling' iSCSI once the specs crystallize.  Who shaves the barber?
> > > > > >|
> > > > > >|An integrity check is not necessary as long as some lower layer
> > > > > >|provides adequate integrity guarantees.
> > > > > >|
> > > > > >|Adding an integrity check above the transport layer is based upon
> > > > > >|documentation of the presence of a lot of crappy network hardware and
> > > > > >|software and analyses of the transport integrity check (TCP checksum)
> > > > > >|which suggest it might not be adequately strong against some such
> > > > > >|observed errors.
> > > > >|
> > > > > >|I claim that the high incidence of `broken' (corruption introducing)
> > > > > >|components is a result of a variety of factors which have shaped the
> > > > > >|development of network components thus far.  The fact that integrity
> > > > > >|checks are assumed to be performed in a network context substantially
> > > > > >|lowers the bar for implementation correctness.
> > > > > >|
> > > > > >|In a storage (or CPU) context, these types of implementation errors
> > > > > >|are a) more easily detectable (more fatal) b) more carefully avoided
> > > > > >|during implementation (because of the cost of a potential fatal
> > > > > >|error).  If network components magically reached the same `quality
> > > > > >|level' as storage and CPU components, there might be no justification
> > > > > >|for additional integrity checks above the transport.  Similarly if the
> > > > > >|transport (or whatever lower layer) integrity checks are very strong
> > > > > >|(e.g. IPSec), there is, again, no need for a higher level integrity
> > > > > >|check.
> > > > >|
> > > > >|I am not disagreeing that we need an additional integrity check over
> > > > >|TCP in the present target environment, but I do disagree that iSCSI
> > > > >|will always need such a check, independently of what is running
> > > > >|beneath it.
> > > > >|
> > > > >|Steph
> > > > >|
> > > >
> > > >
> > > > -------------------------------------------------------------------
> > > > Chip Sharp                       Consulting Engineering
> > > > Cisco Systems
> > > > -------------------------------------------------------------------
> > > >
> > > >
> > > >
> > > >
> >
> > --
> > Randall R. Stewart
> > Systems & Solutions Engineering
> > Cisco Systems Inc.
> > rrs@cisco.com 815-342-5222 or 815-477-2127
> >
> > _______________________________________________
> > tsvwg mailing list
> > tsvwg@ietf.org
> > http://www1.ietf.org/mailman/listinfo/tsvwg
> >

-- 
Randall R. Stewart
Systems & Solutions Engineering
Cisco Systems Inc.
rrs@cisco.com 815-342-5222 or 815-477-2127
References:
- RE: [Tsvwg] [SCTP checksum problems]
  - From: "Douglas Otis" <dotis@sanlight.net>