Re: iSCSI: Keep alive

To: ips@ece.cmu.edu
Subject: Re: iSCSI: Keep alive
From: Stephen Bailey <steph@cs.uchicago.edu>
Date: Wed, 08 Nov 2000 08:48:13 -0600
In-Reply-To: Message from "John Hufferd/San Jose/IBM" <hufferd@us.ibm.com> of "Fri, 03 Nov 2000 10:43:01 PST." <OF2174B5C4.C1C276E5-ON8825698A.00807C67@LocalDomain>
Sender: owner-ips@ece.cmu.edu

John Hufferd,

> It has been stated that there may be value in ensuring that a link has gone
> down and re-establish it, without the SCSI application being aware.

A link, or a connection?  If a TCP connection has failed, it implies
that the path has been interrupted in more than a momentary way.  A
retry strategy for a failed TCP connection should be non-aggressive.
As such, being able to reestablish a failed TCP connection without
having to notify the ULP is unlikely.

So, I don't agree that this goal is feasible or desirable, but...

> 1. If for what ever reason an iSCSI session thinks something has gone wrong
> and is contemplating dropping the connection it should Ping before dropping
> a session that maybe down, as a possible confirmation that the link is
> down.  

What this amounts to is a decision by the iSCSI layer that a path
which is `good enough' for TCP is not good enough for the iSCSI
application.

I believe that the iSCSI spec should not have anything to say about
this.  Any path which is good enough for a TCP connection should be
good enough for iSCSI in general.  Of course, specific implementations
can make this choice, but how they determine that the connection is
not good enough is implementation dependent.  Maybe it involves an iSCSI
NOP-{In,Out}, or maybe it involves running a little SCSI command, or
something else entirely.

Therefore, I see no requirement to ping before `dropping' (actively
closing) a connection.

> 2. No point to pinging every 10 Seconds or so.

A keep-alive like this should use an exponential backoff, rather than
a constant interval.
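
To make the backoff suggestion concrete, here is a minimal sketch of such
a probe schedule.  The starting interval, growth factor, and ceiling are
illustrative assumptions only; nothing in the iSCSI draft specifies them,
and a real implementation would presumably reset the schedule whenever
traffic is seen on the connection.

```python
from itertools import islice

def keepalive_intervals(initial=10.0, factor=2.0, ceiling=300.0):
    """Yield successive idle waits, in seconds, between keep-alive probes.

    All three parameters are hypothetical values chosen for illustration.
    """
    wait = initial
    while True:
        yield wait
        # Grow geometrically, then hold at the ceiling.
        wait = min(wait * factor, ceiling)

# First seven waits of the schedule.
schedule = list(islice(keepalive_intervals(), 7))
print(schedule)
```

Compared with a constant 10-second ping, a long-idle connection in this
sketch settles down to one probe every 5 minutes.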

One might claim that the only justifiable keep-alive would be
something of this form implemented by a target, comparable to (or
equivalent to) the TCP keep-alive.

The TCP keep-alive allows servers to free connection resources from
clients to whom the path is lost and with whom no operations are
currently outstanding.  In iSCSI the target is the server and the
initiator is the client.  Since the client initiates all exchanges,
including closing the connection in the nominal case, it does not need
such a timer.  In the case where the path has failed and the client
attempts an activity which leads to it discovering the failed path, it
will declare the connection closed.  However, the server has no way of
discovering that the client has declared this, and so cannot recover
its connection state.
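
For reference, this is roughly how a server-side implementation might ask
the TCP stack for keep-alive probing on an accepted socket.  It is a
sketch, not a recommendation: the helper name and the 7200-second idle
time are assumptions (two hours is the traditional default), and
TCP_KEEPIDLE is a platform-specific option that is not available
everywhere, so it is guarded.

```python
import socket

def enable_tcp_keepalive(sock, idle_seconds=7200):
    """Turn on TCP keep-alive probing for an idle connection.

    Returns True if the stack reports the option as enabled.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The idle time before the first probe is tunable per socket only on
    # platforms that expose TCP_KEEPIDLE (Linux does).
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
```

A target would call this on each socket returned by accept(); the
initiator, per the argument above, has no need for it.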

Is a keep-alive the most efficient and appropriate way in iSCSI to
recover the resources of nonviable idle connections on the server
(target)?

I think that a demand-driven approach is a better idea---when a target
needs to recover connection resources it will initiate a close on an
idle connection.  It leads to much less network traffic than
keep-alives, particularly given common quadratic connectivity
scenarios.
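
A back-of-the-envelope sketch of the traffic argument, assuming every
initiator holds a connection to every target (the quadratic case).  The
population sizes and the 10-second interval are made-up numbers for
illustration; the figure of 4 packets per ping is the upper end of the
"3 or 4 messages" count given later in this message.

```python
def keepalive_packets_per_second(initiators, targets, interval_s, packets_per_ping):
    """Fabric-wide keep-alive load when all connections are idle."""
    connections = initiators * targets  # full mesh: quadratic in population
    return connections * packets_per_ping / interval_s

# Hypothetical fabric: 100 initiators, 100 targets, one ping per 10 seconds.
steady = keepalive_packets_per_second(100, 100, 10, 4)
print(steady)  # 4000.0
```

A demand-driven close costs nothing while connections sit idle; it
generates traffic only when a target actually reclaims a connection.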

In this case the iSCSI spec only has to ensure that such a target-side
close is not prohibited, and it might mention that target initiated
closes are a way to handle the lost client problem.
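
One way a target might implement such demand-driven closes is sketched
below.  The class, its fields, and the least-recently-active eviction
policy are all hypothetical choices for illustration; the only property
taken from the discussion above is that a connection is closed only when
resources are needed and only if it has no outstanding operations.

```python
class TargetConnectionTable:
    """Toy model of a target with room for a fixed number of connections."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.conns = {}  # conn_id -> {"last_active": float, "outstanding": int}

    def touch(self, conn_id, now, outstanding=0):
        """Record activity and the number of outstanding commands."""
        self.conns[conn_id] = {"last_active": now, "outstanding": outstanding}

    def admit(self, conn_id, now):
        """Admit a new connection, closing an idle one only if needed.

        Returns (admitted, closed_conn_id_or_None).
        """
        closed = None
        if conn_id not in self.conns and len(self.conns) >= self.capacity:
            # Only connections with nothing outstanding are candidates.
            idle = [c for c, s in self.conns.items() if s["outstanding"] == 0]
            if not idle:
                return False, None
            # Close the one that has been quiet the longest.
            closed = min(idle, key=lambda c: self.conns[c]["last_active"])
            del self.conns[closed]
        self.touch(conn_id, now)
        return True, closed
```

If the closed connection was in fact still viable, the initiator simply
finds out on its next operation and reconnects; if the client was lost,
the target has recovered the resources without any periodic probing.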

> Therefore, a ping at (and only at) the time of suspicion and to avoid a
> inappropriate connection drop is a valid approach.

Again, this seems to be focusing on an initiator's desire to
short-circuit the TCP connection management process, which I think is
not an appropriate thing for iSCSI to specify.  I am definitely not
saying that this behavior should be prohibited; the mechanism to do
this is implicitly present in NOP-{In,Out}.  But it should not be
specified or discussed in the iSCSI spec.

One peculiar thing to note about pinging with NOP is that it actually
involves 3 or 4 messages, since both NOP-In and NOP-Out will need TCP
ACKs (the NOP-In ACK might be piggybacked).  All the more reason to
go crazy minimizing (or, I say eliminating) any specified pinging.

Steph

References:
- iSCSI: Keep alive
  - From: "John Hufferd/San Jose/IBM" <hufferd@us.ibm.com>
