RE: iSCSI: response to second login (with same ISID)

To: "'julian_satran@il.ibm.com'" <julian_satran@il.ibm.com>
Subject: RE: iSCSI: response to second login (with same ISID)
From: "Martin, Nick" <Nick.Martin@compaq.com>
Date: Tue, 29 May 2001 01:32:22 -0500
Cc: ips@ece.cmu.edu
Content-Type: text/plain;charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu

Julian,

Yes, there is nothing preventing a target from being "well behaved", but
there is also nothing to assure the initiator that any particular target
will be "well behaved".  At the moment, I think we do not have a definition
of "well behaved" in terms of how long it should take a target to discover
loss of connection to an initiator.  Thus some folks are sure it will be
"too long" and that we need a way to circumvent waiting for it.  IMHO we
should either agree in advance on such parameters for all targets (if we
could), or allow all initiators to set, or at least query, the
behavior-defining parameter values used by their targets.

How often should a "well behaved" iSCSI target test each of its connections?
How long should a target wait for the response to a "NOP-OUT" (a.k.a. iSCSI
ping) before presuming the connection failed?  Is more than one attempt
required?  If this behavior should be configurable, should it be
configurable by the administrator of the target, or the administrator of the
initiator, or negotiated?  (For leased storage, these two administrators may
not even work for the same employer.)  To whom should it be visible/known?

I presume any connection is considered tested and OK for N seconds after the
latest valid PDU arrives.  Has a value for N been discussed?  How long before
or after N seconds of inactivity should we begin testing the connection?
Does this apply equally to targets and initiators listening on their iSCSI
connections?
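
To put rough numbers on the questions above, here is a minimal sketch of the
worst-case arithmetic.  It assumes (my assumption, not anything in the
draft) that a side starts pinging once N seconds pass with no valid PDU,
waits a fixed time for each reply, and presumes failure after a fixed number
of unanswered pings sent back to back.  The names LivenessPolicy and
worst_case_detection_seconds are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class LivenessPolicy:
    """Hypothetical per-connection liveness settings (not from the draft)."""
    idle_ok_seconds: float       # "N": connection counts as tested this long after a valid PDU
    ping_timeout_seconds: float  # how long to wait for the reply to one ping
    max_attempts: int            # unanswered pings before presuming the connection failed


def worst_case_detection_seconds(policy: LivenessPolicy) -> float:
    """Time from the last valid PDU until the connection is presumed failed.

    Assumes the first ping goes out exactly when the idle window expires and
    each further ping is sent as soon as the previous one times out.
    """
    return policy.idle_ok_seconds + policy.max_attempts * policy.ping_timeout_seconds
```

For example, with N = 10 seconds, a 5 second reply timeout and 3 attempts,
the peer is presumed gone 25 seconds after its last valid PDU.  TCP's own
retransmission timers are ignored here.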

If I interpret correctly, we presently have some (negotiated) parameters for
how long, after detecting a connection failure, the target must preserve the
session state to allow the initiator to attempt to reconnect.

It may be that something already dictates "it will take 2.5 to 3.0 minutes"
(seconds? hours?) from the moment the initiator ceases to function until the
target detects the connection failure, and that this is an inherent
characteristic of iSCSI.  If that is the case, then some folks will say
"those times are fine with me", and others may say "that's too long".  Those
who cannot live with our protocol can deploy something else.  However, I do
not see what currently dictates how long this should or could take.  I hope
not to exclude applications with reasonably strict recovery time
requirements.  I also hope not to ping every second on every idle iSCSI
connection only to hasten recovery for a few applications.

Thanks,
Nick
-----Original Message-----
From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
Sent: Saturday, May 26, 2001 7:42 AM
To: ips@ece.cmu.edu
Subject: RE: iSCSI: response to second login (with same ISID)




Martin,

I agree that it would be bad if, at login, an initiator had to wait a long
time for the target to detect that the old session has gone away.  But there
is nothing in the current spec that will prevent a good target
implementation from doing what you describe without any negotiation.

Regards,
Julo

"Martin, Nick" <Nick.Martin@compaq.com> on 25-05-2001 23:22:11

Please respond to "Martin, Nick" <Nick.Martin@compaq.com>

To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
cc:
Subject:  RE: iSCSI: response to second login (with same ISID)




Julian,

I do not know how long it would take a target using a ping to determine that
an old session is really dead.  Some folks think it would be too long, thus
they want to be able to "login unconditionally".  Primarily I thought this
would be used for initiators rebooting after ungraceful shutdown.  (For
those who can otherwise reboot quickly, but would stall on each iSCSI
session not already recovered by the target.)

I do not expect to want to use a "login unconditionally" option.  (I can
handle rejection ;)
The scary thing to me was the prospect that an initiator could not try an
ISID which was in use (intentionally, accidentally, or erroneously) without
disrupting (logging out) its current user (a.k.a. option 2).
It appears that if given a "login unconditionally" option, some folks would
use it exclusively.

If the initiator can specify the maximum time it would take a target to
notice dead connections and sessions, then the argument that waiting for
target recovery of the session would take too long presumably goes away.

I am thinking about login negotiated parameters like MinIdleTime and
MaxIdleTime values in seconds.  These could specify respectively the
connection idle interval after which a NO-OP (keep-alive) should be sent and
expected, and the time with no traffic received on this connection which
should be regarded as an implicit logout request due to connection failure.
(This could cause an async event if detected by the target and a working
connection remains in this session.)

Better names can be chosen.  Reasonable defaults might be 10 seconds and 60
seconds.  There should be a recommendation such as: MaxIdleTime should
always be at least 3 times MinIdleTime.  Such parameters would probably
exist within the Target and the Initiator, even if they are not negotiated
or exchanged.  The intent is that an Initiator that cares can limit Target
connection recovery delays.
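
As a rough illustration of how the two values could be applied, the sketch
below classifies a connection by how long it has been idle.  The parameter
names and the 10/60 second defaults come from the text above; the function
names and the returned strings are hypothetical, and nothing here is taken
from the draft.

```python
def idle_params_ok(min_idle: float, max_idle: float) -> bool:
    """Check the suggested rule: MaxIdleTime is at least 3 times MinIdleTime."""
    return min_idle > 0 and max_idle >= 3 * min_idle


def connection_action(idle_seconds: float,
                      min_idle: float = 10.0,
                      max_idle: float = 60.0) -> str:
    """Decide what to do with a connection idle for `idle_seconds`."""
    if idle_seconds >= max_idle:
        # No traffic for MaxIdleTime: treat as an implicit logout request.
        return "implicit-logout"
    if idle_seconds >= min_idle:
        # Idle for MinIdleTime: a NO-OP (keep-alive) should be sent and expected.
        return "send-keepalive"
    return "ok"
```

With the defaults, a connection idle for 5 seconds is left alone, one idle
for 10 seconds gets a keep-alive, and one idle for 60 seconds is treated as
failed.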

These parameters, set by the Initiator for the Target, would be known to
both.  A broken connection could be detected by each at approximately the
same time.  The keep-alive NO-OPs could be pings so that both directions
would be kept alive by a single exchange.  If the Initiator wants to be
responsible for keep-alives, it should do them before the specified
MinIdleTime.  If it wants the Target to do them, it may need to do them
also, slightly after the specified MinIdleTime, but well before MaxIdleTime.
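
One way to read the timing rule above is sketched below: the Initiator
either leads (pings shortly before MinIdleTime) or acts as a backup (pings
shortly after MinIdleTime, in case the Target's keep-alive never arrives).
The function name, the `margin` parameter and its 1 second default are my
own assumptions for illustration only.

```python
def initiator_keepalive_at(min_idle: float,
                           max_idle: float,
                           initiator_leads: bool,
                           margin: float = 1.0) -> float:
    """Idle time (seconds) at which the Initiator sends its keep-alive ping."""
    if not 0 < margin < min_idle:
        raise ValueError("margin must be positive and smaller than MinIdleTime")
    # Leading: ping just before MinIdleTime so the Target never needs to.
    # Backup: ping just after MinIdleTime if the Target's ping did not show up.
    when = min_idle - margin if initiator_leads else min_idle + margin
    if when >= max_idle:
        raise ValueError("keep-alive would not be sent before MaxIdleTime")
    return when
```

With MinIdleTime = 10 and MaxIdleTime = 60, a leading Initiator would ping
after 9 idle seconds and a backup Initiator after 11.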

My networking is weak, so I am not sure whether there should be a separate
(shorter) time specified for the maximum time to wait for the reply to a
ping, or how long to wait to retry it.  I am not clear on the interactions
with TCP and its timeouts and retries.  Is there a maximum time a ping could
take and still get through?  Is there a maximum time a busy iSCSI target or
initiator would be expected to delay before sending a reply to a ping?

Does this sound like a good idea at all or not?

Thanks,
Nick

