Re: Target Reset handling

Julian,

I agree with you that the interactions should be left as simple as
possible, but I strongly believe that a response to a target reset is
highly desirable and very useful. Besides, my reading of SAM-2 (as quoted
below) indicates that a response is required.

While SAM-2 leaves the definition of the target reset process to the
specific SCSI protocol and interconnect, I would argue that in the case of
iSCSI, resetting the entire TCP/IP stack and tearing down the TCP
connections should not be necessary. It would seem that only the SCSI
operational portion (task manager, device server(s), and the tasks
themselves) should be affected by a SCSI target reset. As I recall, this is
the approach FC target implementations took, leaving the process login
sessions intact.

Having the TCP connections operational would allow us to -
 - reliably deliver the AEN to all the iSCSI-logged-in initiators.
 - deliver a response to the target reset task mgmt request.

There's a fair amount of SCSI/LVM software that relies on a reliable
target reset confirmation, and an out-of-band confirmation of the same
would not go well with it.

Regards.
--
Mallikarjun
M/S 5601
Networked Storage Architecture
HP Storage Organization
Hewlett-Packard, Roseville.
cbm@rose.hp.com

>Mallikarjun,
>
>We assumed that a reset for a controller including such a complex piece
>of code/hardware as a protocol stack is never complete without resetting it
>too.
>Making AE signalling mandatory is not a big help, as the events might get
>lost if sent before the connections break. But if the group feels we should
>make it mandatory we will.
>
>For any simple-minded (minimalist) design, a protocol function
>that does what a reset button would do should be enough.
>
>For complex controllers you can choose to:
>
>- record the reset results in NV store and report them during the next
>session, or even make reading this log in a sense mandatory through an ACA
>
>- report the event through a vendor specific text response to the next
>login
>
>- use management interfaces (SNMP)
>
>I personally am inclined to think that the "in-band" reset should be left
>as simple as possible (but not simpler!) and out-of-band communication
>should be used to check the process.
>
>Regards,
>Julo
>
>
>"Mallikarjun C." <cbm@rose.hp.com> on 31/08/2000 20:51:30
>
>Please respond to cbm@rose.hp.com
>
>To: ips@ece.cmu.edu
>cc: (bcc: Julian Satran/Haifa/IBM)
>Subject: Re: Target Reset handling
>
>
>James,
>
>I am in complete agreement with you on all these concerns as they parallel
>mine as noted in an earlier posting
>(http://ips.pdl.cs.cmu.edu/mail/msg00348.html),
>the relevant portion of which is reproduced below -
>
>o Section 3.7.1 on page 23. For the Target Reset task management
>  function, the target is not expected to provide a response - and
>  this is concerning to me.
>  - how would the initiator confirm the successful completion of
>    the target reset?
>
>  - not having a response effectively makes the target reset
>    an operation iSCSI hardware cannot assist. Having a response
>    would make it no different from the rest of the iSCSI transactions
>    and the hardware can gracefully deal with it.
>
>  - SAM-2 (section 6.6, page 63) specifies what a target should
>    do "Before returning a FUNCTION COMPLETE response". This seems
>    to me as an implicit requirement that a response be returned
>    on a Target Reset task management function.
>
>  - I am also unclear as to why the sessions are allowed to be
>    terminated. In the FC world, as far as I can recall, the
>    process login sessions remain intact after a target reset.
>    If it is required that the sessions and the associated TCP
>    connections be cleared in iSCSI, it is helpful to mandate (as
>    opposed to leaving it up to the implementation) that an Async
>    event shall be reported to all the initiators currently logged in.
>
>--
>Mallikarjun
>M/S 5601
>Networked Storage Architecture
>HP Storage Organization
>Hewlett-Packard, Roseville.
>cbm@rose.hp.com
>
>
>>In reading the iscsi-01 draft, I was bothered by several things in the
>>handling of Target Reset.
>>
>>a) The lack of at least a basic ACCept on the Target Reset. If the target
>>can send an async event, why not at least notify reception of the function?
>>
>>Given connections with lots of outstanding traffic, I'd see this as a more
>>graceful reset procedure. It allows any outstanding i/o that may be
>>completing while the TR is in transit (or queued for processing on the
>>target) to do so, possibly lightening the load of i/o that has to error to
>>complete. This would potentially quicken the recovery time post reset. I
>>would expect this to be more important as the "network" gets larger and
>>longer.
>>
>>Note: FCP does support this behavior.
>>
>>b) Why not require async events to all initiators?
>>
>>The biggest headache with Target Reset is how long it takes for the other
>>initiators to recognize the device has been reset. The 1st new i/o will get
>>a Unit Attention CA, but this status is typically seen only by the SCSI
>>class driver (e.g. disk/tape/etc). Unless instructed by the class driver,
>>the port level driver (e.g. scsi/fc/iscsi hba) will have to timeout the
>>i/o's (if timing was requested) to recover their context. If the class
>>driver does try to tell the port driver, it typically will do so in a crude
>>fashion - issuing abort requests on the i/o's it knows about.
>>
>>Perhaps, if the TCP connections are gracefully shut down between the
>>initiator and target, the initiator will be able to abort the i/o on the
>>connections quickly (in this case, it looks like a pseudo async event).
>>However, if there is no handshaking on the connections, my limited
>>experience with TCP says it takes a long time for the connection to error
>>out and reset. And during this process, we'll be sending i/o abort requests
>>down the terminated-on-one-end connection. All this would make the recovery
>>time on these other initiators very large.
>>
>>Note: this point assumes that if async events are required - they are ack'd.
>>
>>c) Is there something inherent that requires the TCP connections to be
>>terminated?
>>
>>The TCP connections look very similar to (but not the same as) FCP Process
>>logins between the initiator and target. In FCP, the reset did not
>>necessarily disrupt the port or process logins. It only had to affect the
>>FCP/SCSI task manager. (note: a device was free to really reset, thus indeed
>>tearing down the logins - with the FC port machine handling it as an error)
>>
>>What is the background that required the TCP sessions to be broken?
>>
>>Obviously, if they are not broken, it affects the answers to (a) and (b)
>>above.
>>
>>d) Given the history of long error recovery times in multi-initiator
>>environments in both parallel scsi and fibre channel on BDR's/Target
>>Reset's, any speed up in this area would be advantageous.
>>
>>-- James
>>
>>--------------------
>>James Smart
>>Trebia Networks, Inc           Ph:   978-318-9547
>>35 Forest Ridge Rd             Cell: 603-674-3687
>>Concord, MA 01742              james.smart@trebia.com
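
The behaviour argued for in the reply at the top of this thread (reset only
the SCSI task layer, keep the TCP connections up, confirm in-band to the
requester, and notify every other logged-in initiator) can be illustrated
with a small model. This is only a sketch under stated assumptions: the
Session and Target classes, the tuple-shaped "PDUs" and their names are all
hypothetical and are not taken from the iSCSI draft or from SAM-2.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """One logged-in initiator; the TCP connection is modelled as a flag."""
    initiator: str
    connected: bool = True
    pending_tasks: list = field(default_factory=list)
    inbox: list = field(default_factory=list)  # messages delivered to the initiator


class Target:
    """Hypothetical target that keeps sessions alive across a target reset."""

    def __init__(self):
        self.sessions = {}

    def login(self, initiator):
        self.sessions[initiator] = Session(initiator)
        return self.sessions[initiator]

    def target_reset(self, requester):
        # Reset only the SCSI layer: drop outstanding tasks, leave connections up.
        for session in self.sessions.values():
            aborted = len(session.pending_tasks)
            session.pending_tasks.clear()
            if session.initiator != requester:
                # Assumed async event so the other initiators learn of the reset
                # without waiting for their i/o to time out.
                session.inbox.append(("ASYNC_EVENT", "TARGET_RESET", aborted))
        # In-band confirmation, sent only after the reset work is done.
        self.sessions[requester].inbox.append(
            ("TASK_MGMT_RESPONSE", "FUNCTION COMPLETE"))


# Usage: two initiators logged in, "init-a" issues the target reset.
target = Target()
a = target.login("init-a")
b = target.login("init-b")
a.pending_tasks += ["read-1"]
b.pending_tasks += ["write-7", "write-8"]
target.target_reset("init-a")
print(a.inbox)
print(b.inbox)
```

Because both sessions stay connected, the requester gets its confirmation on
the same connection and the other initiator can recover its aborted i/o
context immediately instead of timing out, which is the recovery-time
concern raised in points (b) and (d) of the quoted message.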