Re: iSCSI: "Wedge" drivers

To: Black_David@emc.com
Subject: Re: iSCSI: "Wedge" drivers
From: "John Hufferd/San Jose/IBM" <hufferd@us.ibm.com>
Date: Sat, 26 Aug 2000 16:55:13 -0700
Cc: ips@ece.cmu.edu
Sender: owner-ips@ece.cmu.edu
David,
Good write-up.  Let me hypothesize some configurations and approaches,
then ask you some questions, and then let you pick them apart with your
knowledge of your hardware.  Everyone else, please check whether it all
makes iSCSI sense.

Suppose that a host has only one adapter (NIC), and the failure occurs at
the storage controller in what you call an interface processor (let's
suppose it is an iSCSI-capable interface processor).  Let's further suppose
that the single NIC has two TCP/IP connections, one to one interface
processor and another to a second interface processor.

First, we would want information available (perhaps in an LDAP "Naming"
Directory of some type) such that the iSCSI driver would be able to tell
that both TCP/IP connections could drive commands to the same LUs.  If
that information were there, iSCSI should be able to re-drive commands even
during a link failure (which in this case was caused by the failure of an
interface controller).  I believe this could be made to work with IBM
equipment and EMC's, but do you believe that it will work?

The question, of course, is whether this is a valid Session.  That is, a
session is generally defined by the same initiator (with the same ISID)
connecting with the same target.  In this case it is probable that the
target has two IP addresses.  But since the target fills in the TSID
that completes the Session ID (SSID), the Target can force the relationship
to happen, whether or not the Initiator gets the information from the LDAP
"Naming Directory", by returning the same TSID across all the interface
processors.  That, by definition, should make this a single session with
multiple connections.
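
The session argument above can be sketched in a few lines.  This is only
an illustration, assuming the SSID is modelled as the plain (ISID, TSID)
pair; the Connection class, the IP addresses and the ID values are all
hypothetical and are not taken from any draft.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """One TCP/IP connection as seen by the initiator (hypothetical model)."""
    target_ip: str  # address of the interface processor that accepted the login
    isid: int       # initiator-chosen part of the session ID
    tsid: int       # target-assigned part of the session ID


def group_by_session(connections):
    """Group connections by SSID, modelled here as the (ISID, TSID) pair."""
    sessions = defaultdict(list)
    for conn in connections:
        sessions[(conn.isid, conn.tsid)].append(conn)
    return dict(sessions)


# Two interface processors with different IP addresses return the same TSID,
# so the initiator ends up with one session holding two connections.
conns = [
    Connection("10.0.0.1", isid=0x11, tsid=0x7),
    Connection("10.0.0.2", isid=0x11, tsid=0x7),
]
sessions = group_by_session(conns)
```

If the two interface processors returned different TSIDs instead, the same
grouping would yield two separate single-connection sessions.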

Next, suppose (using the same host and NIC) it were possible to put a TCP/IP
load-balancing switch ahead of the interface processors.  Since this switch
exports a single TCP/IP address, the Initiator might not fully understand
the connection to the target.  However, since the Target sets the TSID, the
Initiator should see this as a valid session with multiple connections.
This requires the Interface Processors (as above) to return the same TSID.
If that is done, do you think that just retrying commands on the alternate
TCP/IP connection -- within the session -- would work with EMC equipment
(assuming that the load balancer drove the command to the surviving
interface processor)?
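
As a rough sketch of what "just retrying commands on the alternate TCP/IP
connection" could look like in a generic driver, assuming every connection
in the list belongs to the same session.  FakeConnection, ConnectionFailed
and redrive are hypothetical names for illustration only.  The sketch
ignores the hard part: knowing whether the failed command was already
executed by the target before it is re-driven.

```python
class ConnectionFailed(Exception):
    """Raised when a command cannot be delivered on a connection."""


class FakeConnection:
    """Stand-in for one TCP/IP connection of a session (hypothetical)."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def send(self, command):
        if not self.healthy:
            raise ConnectionFailed(self.name)
        return f"{command} completed via {self.name}"


def redrive(command, connections):
    """Try the command on each connection in turn until one succeeds."""
    failed = []
    for conn in connections:
        try:
            return conn.send(command)
        except ConnectionFailed:
            failed.append(conn.name)
    raise ConnectionFailed(f"all connections failed: {failed}")


# The first interface processor has failed; the command is re-driven
# on the surviving connection of the same session.
session = [FakeConnection("ip-A", healthy=False), FakeConnection("ip-B")]
result = redrive("READ lun0 block 42", session)
```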

With the above examples, I am just wondering about your opinion: would a
generic iSCSI Device Driver do an adequate job of error retry, or are
there so many things that need to be done that only your (in this case
EMC-written) Wedge Driver could do the recovery job, and the types of
things it did would be incompatible with other vendors' hardware?  My
general opinion is that it could work with IBM Shark.

If by any chance you think that the iSCSI Device Driver could handle the
failures and retry adequately in the above single-NIC configurations, then
it should be possible to have a single session with multiple TCP/IP
connections, each on a different Host NIC, and have iSCSI perform
recovery just as well (maybe better).  On the other hand, if you believe
that iSCSI would not be able to handle multiple connections per Session on
a single HBA, is there anything better iSCSI could do in recovery if it had
multiple TCP/IP connections, each on a different NIC, but part of a single
Session?

I think both approaches could work on IBM Shark (at least for the basic
recovery).  Now of course the devil is in the details, and I am sure that I
can hold debates with the people involved and they might take the opposite
view.

If, on the other hand, you do NOT think iSCSI can be made to adequately
address your interface processor recovery needs by performing rather
generic recovery techniques, then I would agree with you that a separate
Wedge Driver would be required.
Further, if that is the case, then the paths -- which the Wedge sees -- need
to be known by the Wedge, and it must be able to map them to alternate
physical interface processors.  That would mean that only a single
connection per Session would be useful, along with a single session per
NIC.  But this would also probably mean that vendor-specific information
would need to be put into some kind of database that the Wedge Driver
could use, or a lot of administrative work would be needed.  That is, the
LDAP "Name Server" database will need not only a way for iSCSI
device drivers to access the information, but also a generalized
interface so that applications such as vendor-specific Wedge Drivers
can set and access the information.
(On the other hand the Wedge could collate the LU views it gets from each
path to determine which ones are alternates, the way it does today.)
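
A minimal sketch of that collation step, assuming each path reports a list
of LU identifiers that are unique per LU (for example a serial number).
The path names and LU IDs below are made up, and this is not a description
of how any particular vendor's Wedge Driver does it.

```python
from collections import defaultdict


def collate_paths(path_views):
    """Map each LU identifier to the sorted list of paths that can reach it.

    path_views: dict of path name -> list of LU identifiers seen on that path.
    """
    lu_to_paths = defaultdict(list)
    for path, lu_ids in path_views.items():
        for lu_id in lu_ids:
            lu_to_paths[lu_id].append(path)
    return {lu: sorted(paths) for lu, paths in lu_to_paths.items()}


# Hypothetical views from two paths through different interface processors.
views = {
    "nic0->ip-A": ["LU-0001", "LU-0002"],
    "nic0->ip-B": ["LU-0001", "LU-0003"],
}
alternates = collate_paths(views)
```

Here LU-0001 is reachable over both paths, so either path can serve as the
alternate for the other; LU-0002 and LU-0003 each have only one path.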

However, since I think it is possible to actually do correct (fundamental)
error recovery, I would not like to remove multiple connections per
session.  It could be that the feature never gets used, but I doubt that.
In any case, since a single connection per session is the "default", I think
we can leave multiple connections per session in, and let vendors exploit
the possibilities of multiple connections per session, not only for failure
recovery but also for additional performance (via multiple NICs), without
requiring vendor-specific Wedge Device Drivers.

Having said all that, it still may be the case that the generic iSCSI Device
Driver can only do fundamental alternate-path recovery, and that more
advanced functions are still needed in vendor-specific Wedge Drivers.  So
the question is: just because the iSCSI Device Driver cannot do all
things, do we prevent it from doing any of the failure recovery, etc., by
eliminating multiple connections per Session?


.
.
.
John L. Hufferd


Black_David@emc.com@ece.cmu.edu on 08/25/2000 03:01:28 PM

Sent by:  owner-ips@ece.cmu.edu


To:   ips@ece.cmu.edu
cc:
Subject:  iSCSI: "Wedge" drivers



I guess I need to take my co-chair hat off and say
a few things about EMC's and other "Wedge" drivers.

Wedge drivers are motivated by fault tolerance and
fail-over in addition to load balancing/spreading.
EMC actually has two products, PowerPath for Symmetrix
(does all three), and ATF for Clariion (fail-over only).
Other products include HDS's SafePath, Veritas's DMP
and HP's PV Links.  Apologies to those I've left out.
John's goal of eliminating wedge drivers via
iSCSI sessions may not be achievable in practice
due to fault tolerance and fail-over concerns.

The basic fault tolerance/failover concern
arises from the requirement that failure of one
interface processor in the disk array should
not disable all access to the host's storage.
Hence the host to storage connectivity usually
encompasses more than one interface processor,
facing array designers with the choice of
using multiple SCSI connections and keeping
the SCSI state in each interface processor
vs. sharing the SCSI state across multiple
interface processors, and making sure everything
works right when one of them fails in an arbitrary
fashion (ouch).  The former is considerably
easier to implement, and requires a wedge driver
on the host side.  The corresponding issue of
how a host deals with possible failure of an
iSCSI HBA has already been noted, and I agree
that the most likely approach is a wedge driver.

Since John works for IBM, I should note that
the phrase "interface processor(s)" isn't directly
applicable to the IBM Shark array, but nonetheless,
this issue is still present -- for fault tolerance,
storage access has to be spread across both RS/6000
systems running AIX in a Shark, and sharing SCSI
state across AIX instances doesn't sound like an
easy thing to do.

The bottom line is that failure and fault tolerance
concerns make it unlikely (IMHO) that an iSCSI session
concept will lead to the extinction of wedge drivers.

There have been some questions about difficulty of
development of wedge drivers.  EMC's experience
is that fault tolerance and fail-over require much
more effort than load balancing for a couple of
reasons.  The first is that more complex systems tend
to have more complex ways of failing than of working
correctly :-(.  The second is that correct behavior
of a wedge driver depends on the correct behavior of
HBA drivers in error cases, and HBA vendors in
general do not exhaustively test correct behavior
of error cases before releasing drivers -- we
find driver bugs on a regular basis.  Given the
intention to implement much/all of iSCSI in
hardware, I suspect that this situation will
continue relatively unchanged.  I'm not sure what
the latter implies for difficulty of development
of error handling code for multi-connection iSCSI
sessions when a single vendor is responsible for
both the hardware and the driver.

--David

---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140, FAX: +1 (508) 497-6909
black_david@emc.com  Cellular: +1 (978) 394-7754
---------------------------------------------------