
    RE: iSCSI: remove recovery from transport-layer connection failure(?)



    Steph,
    
    Any IP-SCSI spec could be seen both as a connection to a controller and as a
    bridge to existing drives.  With 5+ milliseconds of network latency,
    controllers will remain adjacent to the client as a means of protection
    against this latency, in much the same manner as a controller protects from
    drive latency.  As both modes of operation are legitimate, assumptions about
    the transport should be tempered by these possibilities.  SCTP does provide for
    a more immediate recovery.  It would also be irresponsible to promote
    modifications to TCP to support features already found within SCTP.
    
    Doug
    
    > -----Original Message-----
    > From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu]On Behalf Of
    > Stephen Bailey
    > Sent: Tuesday, September 26, 2000 9:16 PM
    > To: ips@ece.cmu.edu
    > Subject: Re: iSCSI: remove recovery from transport-layer connection
    > failure(?)
    >
    >
    > > Currently, iSCSI is spec'ed to recover from transport-layer
    > > connection failures.
    > >
    > > The main motivation for this decision was to support tape backup
    > > applications that are quite sensitive to any failures that get
    > > propagated to their layer.
    > >
    > > So, perhaps we can remove the requirement of recovering from
    > > transport-layer connection failures in iSCSI. This would simplify
    > > the protocol somewhat.
    > >
    > > Thoughts?
    >
    > I'm all for eliminating command recovery.
    >
    > There seem to be several reasons advanced for command recovery.
    >
    > The first seems to be based upon an inappropriate analogy to FCP.
    > Command recovery had to be added to FCP-2 because the FC layer is
    > unreliable.  A single dropped FC frame leads to a failed FCP command.
    > This clearly upsets tape operation even when the link is performing
    > nominally.  In FCP, without command recovery, with some observable
    > frequency, you will get an expected error that leads to complete,
    > irrecoverable failure of a transfer stream.  The other thing that
    > makes FCP-2 command recovery work well is that, when you are doing a
    > write, which is 90% (maybe it's 99%?) of tape operation, the target can
    > return an early indication of most frame drops, rather than waiting
    > for a timer to expire.
    >
    > TCP's reliability solves this problem in another way.  By the time you
    > get a TCP connection failure, you have already exhausted a set of
    > reliability mechanisms which guarantee, with high certainty, that
    > further data cannot be transferred between the two endpoints.
    >
    > The phrase `the two endpoints' suggests the other reason advanced for
    > command recovery.  That is, to permit path failover for commands which
    > are not idempotent, such as tape write sequential.  The
    > problem with this is that it is not clear HOW iSCSI command recovery
    > can actually work properly, given a TCP connection failure indication.
    > It takes a long time for a TCP connection to fail, and by that time,
    > I'm not sure recovery would reasonably be possible.  Perhaps I'm in
    > error on this assumption.  Can a tape guru (Joe from Exabyte?) comment
    > on whether recovery would be possible after many seconds (tens,
    > hundreds) have elapsed?
    >
    > The SCSI layer has never been solely responsible for ensuring reliable
    > backup.  Macro scale things go wrong with tape (run off the end, get
    > eaten, etc.) with relatively high frequency.  A low level backup
    > engine like tar or dump will fail on a SCSI error, and that's OK.
    > There must also be a higher level software component like Amanda,
    > which manages retries, including operator intervention, to ensure
    > reliable backup.
    >
    > It seems like whether iSCSI has a command recovery mechanism should be
    > a function of whether somebody can stand up and say for sure that it
    > solves a real problem.  So far it only seems like it MIGHT solve a
    > problem.  Who can say `this solves MY problem!'?
    >
    > Steph
    >
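    The remark above that it takes a long time for a TCP connection to fail
    can be made concrete with a rough sketch.  It assumes a sender whose
    retransmission timeout doubles after each unanswered attempt, up to a
    cap, and which gives up after a fixed number of retransmissions.  The
    function name and the parameter values (initial_rto, max_rto,
    max_retransmits) are illustrative assumptions, not the defaults of any
    particular TCP stack, and real stacks differ considerably.

```python
def time_to_give_up(initial_rto=1.0, max_rto=60.0, max_retransmits=12):
    """Seconds from the first unanswered send until the sender gives up.

    All three parameters are hypothetical values chosen for illustration.
    """
    total = 0.0
    rto = initial_rto
    # One timeout for the original send, plus one per retransmission.
    for _ in range(max_retransmits + 1):
        total += rto
        # Exponential backoff: double the timeout, but never exceed the cap.
        rto = min(rto * 2, max_rto)
    return total


if __name__ == "__main__":
    for retries in (3, 6, 12):
        seconds = time_to_give_up(max_retransmits=retries)
        print(f"{retries:2d} retransmissions -> about {seconds:.0f} s")
```

    Under these assumed values the total runs from tens to hundreds of
    seconds, which is the range the question to the tape experts asks about.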
    
    


