Target Reset handling

To: "ips" <ips@ece.cmu.edu>
Subject: Target Reset handling
From: "James Smart" <james.smart@trebia.com>
Date: Wed, 30 Aug 2000 10:42:48 -0400


In reading the iscsi-01 draft, I was bothered by several things in the
handling of Target Reset.

a) The lack of at least a basic ACCept on the Target Reset. If the target
can send an async event, why not at least notify reception of the function ?

Given connections with lots of outstanding traffic, I'd see this as a more
graceful reset procedure. It allows any outstanding i/o that may be
completing while the TR is in transit (or queued for processing on the
target) to do so, possibly reducing the amount of i/o that has to be
completed with an error. This would potentially shorten the recovery time
after the reset. I would expect this to be more important as the "network"
gets larger and longer.
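
A minimal sketch of the ordering argued for in (a), assuming the target
returns an explicit accept for the Target Reset. The class and method names
are hypothetical and are not taken from the iscsi-01 draft; the point is only
that i/o completing ahead of the accept does not have to be errored.

```python
from dataclasses import dataclass, field


@dataclass
class Initiator:
    """Hypothetical initiator-side bookkeeping (not from the iSCSI draft)."""
    outstanding: set = field(default_factory=set)
    completed: list = field(default_factory=list)
    errored: list = field(default_factory=list)
    reset_pending: bool = False

    def send_target_reset(self):
        # Nothing is errored yet; the initiator waits for the accept.
        self.reset_pending = True

    def on_completion(self, tag):
        # Completions that were in flight ahead of the accept finish normally.
        if tag in self.outstanding:
            self.outstanding.remove(tag)
            self.completed.append(tag)

    def on_reset_accept(self):
        # Only i/o still outstanding when the accept arrives is errored.
        self.errored.extend(sorted(self.outstanding))
        self.outstanding.clear()
        self.reset_pending = False


def run_example():
    ini = Initiator(outstanding={1, 2, 3, 4})
    ini.send_target_reset()
    ini.on_completion(2)  # finished while the TR was in transit
    ini.on_completion(4)  # finished while the TR was queued on the target
    ini.on_reset_accept()
    return ini.completed, ini.errored
```

In this toy run, two of the four i/o's complete normally and only the other
two have to be errored; without an accept, the initiator has no clear point
at which to stop waiting for stragglers.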

Note: FCP does support this behavior.

b) Why not require async events to all initiators ?

The biggest headache with Target Reset is how long it takes for the other
initiators to recognize the device has been reset. The 1st new i/o will get
a Unit Attention CA, but this status is typically seen only by the SCSI
class driver (e.g. disk/tape/etc). Unless instructed by the class driver,
the port level driver (e.g. scsi/fc/iscsi hba) will have to timeout the
i/o's (if timing was requested) to recover their context. If the class
driver does try to tell the port driver, it typically will do so in a crude
fashion - issuing abort requests on the i/o's it knows about.

Perhaps, if the TCP connections are shut down gracefully between the
initiator and target, the initiator will be able to abort the i/o on the
connections quickly (in this case, it looks like a pseudo async event).
However, if there is no handshaking on the connections, my limited
experience with TCP says it takes a long time for the connection to error
out and reset. And during this process, we'll be sending i/o abort requests
down the terminated-on-one-end connection. All this would make the recovery
time on these other initiators very large.

Note: this point assumes that if async events are required - they are ack'd.
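
To illustrate the difference in recovery time described in (b), here is a
rough sketch under stated assumptions: the target sends an acknowledged async
event to every logged-in initiator, and each port driver otherwise frees an
i/o context only when that i/o's own timer expires. All names and the timeout
values are hypothetical, not from the draft.

```python
class PortDriver:
    """Hypothetical port-level driver state for one initiator."""

    def __init__(self, name, io_timeout_s):
        self.name = name
        self.io_timeout_s = io_timeout_s
        self.outstanding = {}  # tag -> time (seconds) the i/o was issued

    def issue(self, tag, now):
        self.outstanding[tag] = now

    def on_async_reset_event(self, now):
        # With an async event: free every context at once and ack the event.
        freed = sorted(self.outstanding)
        self.outstanding.clear()
        return {"ack": True, "freed": freed, "recovered_at": now}

    def recover_by_timeout(self):
        # Without an event: contexts are held until the last i/o timer fires.
        freed = sorted(self.outstanding)
        recovered_at = max(
            (t + self.io_timeout_s for t in self.outstanding.values()),
            default=None,
        )
        self.outstanding.clear()
        return {"ack": False, "freed": freed, "recovered_at": recovered_at}


def target_reset(drivers, now):
    """Notify every initiator, not only the one that requested the reset."""
    return {d.name: d.on_async_reset_event(now) for d in drivers}


def make_driver():
    d = PortDriver("other-initiator", io_timeout_s=30)
    d.issue(tag=1, now=0)
    d.issue(tag=2, now=5)
    return d
```

With a reset at t=10, the notified driver recovers its contexts at t=10,
while the timeout-only driver in this example holds them until t=35.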

c) Is there something inherent that requires the TCP connections to be
terminated ?

The TCP connections look very similar to (but not the same as) FCP Process
logins between the initiator and target. In FCP, the reset did not
necessarily disrupt the port or process logins. It only had to affect the
FCP/SCSI task manager. (note: a device was free to really reset, thus indeed
tearing down the logins - with the FC port machine handling it as an error)

What is the background that required the TCP sessions to be broken ?

Obviously, if they are not broken, it affects the answers to (a) and (b)
above.

d) Given the history of long error recovery times in multi-initiator
environments in both parallel SCSI and Fibre Channel on BDRs/Target
Resets, any speed-up in this area would be advantageous.

-- James



--------------------
James Smart
Trebia Networks, Inc                  Ph:   978-318-9547
35 Forest Ridge Rd                    Cell: 603-674-3687
Concord,  MA   01742                  james.smart@trebia.com

