RE: iSCSI: Keep alive

To: "John Hufferd/San Jose/IBM" <hufferd@us.ibm.com>, <ips@ece.cmu.edu>
Subject: RE: iSCSI: Keep alive
From: "Douglas Otis" <dotis@sanlight.net>
Date: Fri, 3 Nov 2000 12:11:18 -0800
In-Reply-To: <OF2174B5C4.C1C276E5-ON8825698A.00807C67@LocalDomain>
Sender: owner-ips@ece.cmu.edu

John,

I do not agree with your assessment of the need to establish a connection
failure timeout.  To achieve a desired detection interval within the tens of
seconds, a mechanism must be in place.  Wording such as "contemplating
dropping", "suspicious", and "beyond some Time-out value" fails to provide a
deterministic means of assessing the status of the connection.  I agree
there is no point in pinging every 10 seconds under the following conditions:
	1) There is already communication traffic confirming the connection.
	2) The connection is idle and no status is pending.

This makes a SCSI ping every ten seconds a rare event.  It does, however,
allow expeditious detection of a failure within a time interval suitable for
preventing overlapping retry mechanisms.  As a SCSI ping will be a rare
event used only during periods of very low utilization, requiring the ping
to be serviced by the task process improves the reliability of failure
detection.
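
A minimal sketch of the rule above, in Python, for illustration only.  All
names here (ConnectionState, should_ping, connection_failed, PING_INTERVAL)
are hypothetical and come from no iSCSI draft.  Waiting one further interval
before declaring failure is likewise an assumption, not something specified
in this thread.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed value, taken from the "every ten seconds" figure discussed above.
PING_INTERVAL = 10.0  # seconds


@dataclass
class ConnectionState:
    last_rx_time: float   # time of the most recent traffic seen from the peer
    status_pending: bool  # True while a response or status is still expected


def should_ping(conn: ConnectionState, now: float,
                interval: float = PING_INTERVAL) -> bool:
    """Return True only when a ping is needed to confirm the connection."""
    # Condition 1: recent traffic already confirms the connection.
    if now - conn.last_rx_time < interval:
        return False
    # Condition 2: the connection is idle and no status is pending.
    if not conn.status_pending:
        return False
    # Status is pending and the peer has been silent for a full interval.
    return True


def connection_failed(conn: ConnectionState, now: float,
                      ping_sent_at: Optional[float],
                      interval: float = PING_INTERVAL) -> bool:
    """Assumed policy: fail if a ping goes unanswered for one interval."""
    if ping_sent_at is None:
        return False
    answered = conn.last_rx_time >= ping_sent_at
    return (not answered) and (now - ping_sent_at >= interval)
```

With these rules a busy connection never pings, an idle connection with
nothing outstanding never pings, and only a silent connection with pending
status is probed, which bounds detection to roughly two intervals.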

Doug

> Team,
> Let me see if I can now boil down the thoughts that have occurred on the
> keep-alive thread and some of my input:
>
> It has been stated that there may be value in ensuring that a link has gone
> down and re-establishing it, without the SCSI application being aware.
> Assuming that this is valid:
>
> 1. If for whatever reason an iSCSI session thinks something has gone wrong
> and is contemplating dropping the connection, it should Ping before dropping
> a session that may be down, as a possible confirmation that the link is
> down.  (This is not a 100% guarantee that this will always detect that the
> connection is still active, but if responded to, that will guarantee that
> the connection is still up.)
> 2. No point to pinging every 10 seconds or so.  If nothing is outstanding,
> or missing, then why Ping?  Perhaps the implementation can have time-out
> values, etc., that it can use to determine if a suspicious connection is
> still up.  This seems to be an implementation issue.  I think that an
> implementer note in the draft would be sufficient.  Something like "The
> implementation may consider the discovery of dropped connections by use of
> a Ping, at the point the implementation is suspicious of failure (this
> should not be done regularly).  A suspicion may be raised by an outstanding
> expected response, beyond some Time-out value."
> 3. We do not need to find out if no one is home at the SCSI task layer,
> that is the job of the SCSI Task layer.  We just need to ensure that the
> iSCSI transport is OK.
> 4. It has been pointed out that the ping can be returned by the HW,
> without respect to things going on higher in the various layers, but that
> with SW implementations, it may be blocked behind other unprocessed stuff.
> This also depends on the implementation of the SW, and the buffer
> handling.
> So I would suggest that the very most a ping can help with, in the SW
> implementation, is avoiding a premature line drop -- by the pinging side --
> that would have otherwise occurred without the ping.
> 5. The ping can be useful in sorting out the difference between a long SCSI
> operation and a connection hang. That is, if it is a long SCSI response
> time, the ping will return, and the connection dropping can be avoided.
> 6. The only time the adequacy of the above approach is an issue is when
> stuff has backed up in the iSCSI buffers and SW TCP/IP buffers --
> undelivered to SCSI -- for such a long time that it is noticed on the
> other end.  We have other flow control things to control this problem.
> Therefore, a ping at (and only at) the time of suspicion and to avoid an
> inappropriate connection drop is a valid approach.
>
>
> .
> .
> .
> John L. Hufferd
> Senior Technical Staff Member (STSM)
> IBM/SSG San Jose Ca
> (408) 256-0403, Tie: 276-0403
> Internet address: hufferd@us.ibm.com
>