Re: iSCSI Naming and Discovery

To: ips@ece.cmu.edu
Subject: Re: iSCSI Naming and Discovery
From: "John Hufferd/San Jose/IBM" <hufferd@us.ibm.com>
Date: Tue, 3 Oct 2000 16:09:16 -0700
Content-type: text/plain; charset=us-ascii
Importance: Normal
Sender: owner-ips@ece.cmu.edu

David Black,

Could you please expand (a lot) on the following statement (I am not sure I
followed you):

"[4] Discovery based on IP addresses looks like it works for
boot volumes in a way that URLs don't and scales via wildcarding
in a fashion superior to URLs.  An underlying assumption
I'm making is that storage discovery doesn't need to match
the scale of DNS, and hence centralizing config info isn't
hobbled by NAT issues."

.
.
.
John L. Hufferd


Black_David@emc.com@ece.cmu.edu on 10/03/2000 03:03:19 PM

Sent by:  owner-ips@ece.cmu.edu


To:   ips@ece.cmu.edu
cc:
Subject:  iSCSI Naming and Discovery



The emails from Daniel Smith and Josh Tseng provide
a good start on a naming discussion.  With
my co-chair hat off, let me try to add to it.

The concepts surrounding NAT (Network Address Translation)
figure strongly in this discussion.  RFC 2663 is a
good source of background for anyone who needs it.

An attempt to group and summarize what I see in
Daniel and Josh's emails:

[1] Internet infrastructure is beyond the control
of storage, including iSCSI [Daniel - a),b),f),g),h)]
[2] DNS is the naming structure of the Internet, and
hence hosts exporting storage services have to be
nameable via DNS [Daniel c)].
[3] Network Address Translation and Network Address
Port Translation exist and have to be dealt with [Daniel
d), Josh (1)].
[4] Discovery at Internet scale is a hard problem.  A
flexible naming mechanism keeps options open. [Daniel e)].
[5] SCSI 3rd party commands need to name LUNs, including
iSCSI LUNs [Daniel i)]
*[6] Source, destination, and contents of any packet
on the Internet are public information [Daniel j)].
*[7] Security matters for SSPs [Daniel k)].
[8] SSPs can be expected to connect a lot of clients
to a single DNS address [Daniel l)].
*[9] Identification of storage and authentication are
separate problems and use separate names. [Josh (2)]
[10] Naming/routing information is necessary for proxies
to identify which of the entities they are proxying for
is involved in traffic

[1] and [2] are general descriptions of the Internet.

[6], [7], and [9] are (mostly) about security, and aside
from noting that [6] is incorrect (all of that info can
be hidden by a security gateway using IPsec tunnels and
both IPsec and SSL/TLS hide payloads), security discussion
might be better deferred as I understand that there will
be a serious security proposal in the next version of the
iSCSI draft.

The biggest underlying problem seems to be how to identify
an Initiator or Target - it's part of [3], [5], [8] and
[10].  [4] is about Discovery, which compounds the naming
issues.

At a high level there are three basic ways to identify
initiators and targets:
- Transport address (e.g., IP address, FC port WWN).
- Identification information provided as part of session
  establishment (e.g., username/password, certificate).
- Some combination of the above two.
The third alternative may be problematic if it leads to
needing both the transport and identification information
to determine identity.  The discussions I've seen seem to
be using transport as a hint that may make the identification
easier to verify, which seems like a reasonable optimization
over relying on the identification information alone.  Moving
beyond this (e.g., the *.xyz.com servers may only connect
from addresses in the a.b.c/24 netblock) increases the
amount of information that has to be configured (e.g.,
that example is better left to firewalls to enforce).
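
A minimal sketch of the "transport as a hint" idea, under stated
assumptions: the table, the names in it, and the identify() helper are
hypothetical and not part of any iSCSI draft.  Identity is decided by the
session-establishment information alone; the source address is only
reported as a hint and never changes the outcome.

```python
import hmac
import ipaddress

# Hypothetical configuration: identity name -> (shared secret, expected netblock).
# The netblock is advisory only; 192.0.2.0/24 is a documentation address range.
KNOWN_INITIATORS = {
    "host1.xyz.example": ("s3cret", ipaddress.ip_network("192.0.2.0/24")),
}

def identify(name, secret, source_ip):
    """Return (authenticated, hint_matched).

    authenticated depends only on the name and secret presented at
    session establishment; hint_matched says whether the transport
    address agrees with the configured netblock.
    """
    entry = KNOWN_INITIATORS.get(name)
    if entry is None:
        return (False, False)
    expected_secret, expected_net = entry
    if not hmac.compare_digest(secret, expected_secret):
        return (False, False)
    hint = ipaddress.ip_address(source_ip) in expected_net
    return (True, hint)

print(identify("host1.xyz.example", "s3cret", "192.0.2.7"))
print(identify("host1.xyz.example", "s3cret", "198.51.100.9"))
```

An initiator seen from an unexpected address (for example, through a NAT)
still authenticates in this sketch; only the hint differs.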

NAT mechanisms contribute to the problem by producing
networks in which transport addresses aren't useful for
identifying anything on the other side of the NAT.  Some
NATs can be configured to make this problem somewhat simpler
via static assignment of IP addresses in one domain to IP
addresses in another.

NATs are a thorny subject in IETF - while they are widely
deployed, not all the important protocols work through them;
IPsec AH is the most notable example, and FTP requires a kludge
(er, ah, ALG).  IMHO, restrictions on the use of NATs with iSCSI
may be ok, provided that the consequences are clearly understood.

NATs seem to be a large piece of the forcing function that is
leading us away from the transport-based identification
information used by other SCSI transports.  There are security
consequences here (e.g., cryptography for session establishment
may become mandatory to implement and use), and it will likely
complicate discovery.

For example, I noted that booting is an issue - if iSCSI always
uses URLs to name storage, the result could be a situation in
which a DNS server has to be operational and reachable in order
to boot.  This seems wrong, and the obvious answer of using an
IP address in a URL does not work through NATs, even though
working through NATs was the original motivation for using URLs.

Third-party naming is a tarpit.  Putting my WG co-chair hat
back on for this paragraph only, I observe that global context
for third party names is an unsolved problem in T10; in
general, the Initiator of a 3rd party command must use
names that resolve to the desired LUNs from the 3rd party
command Target's naming perspective.  How to do this when
an Initiator and Target don't share a naming context is
unspecified :-).  While it would be a plus for iSCSI to solve
this one by having global names for 3rd party commands,
I don't think this is a requirement (and whatever we do
will have to be worked through T10, as they have the final
say on name formats).  WG co-chair hat now comes off ...

IMHO, discovery is not getting enough attention.  The proposed
naming scheme complicates discovery without a compelling
solution; I'm concerned that the benefits may not justify the
costs.  My quasi-random walk through this goes something like:

- Not being able to find the boot volume because DNS is down
     or not responding seems to be a wrong answer.  The infamous
     World Wide Wait is bad enough for a browser; it's unacceptable
     for a reboot.
- Hence the boot volume has to be locatable via IP, a TCP port
     (which could be implicit, e.g. the default iSCSI port)
     and a LUN (which could be implicit, e.g., LUN 0).  Unlike
     DNS, I could see putting this into an iSCSI HBA card BIOS.
- Consistency suggests the approach of using IP addresses as
     the primary means of identifying possible storage locations,
     with the possible addition of non-default TCP port numbers.
     A nice consequence of this is that one can use ranges
     (e.g., all the storage is in netblock a.b.c/24 at the
     default iSCSI port, scan those 256 addresses as part of
     discovery on boot).  The corresponding range wildcarding
     mechanisms for URLs will be more complex.
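
The netblock wildcard above can be sketched as follows.  This only
enumerates candidate (IP, port, LUN) triples and does not open any
connections.  Assumptions: the netblock 192.0.2.0/24 is a documentation
range standing in for a.b.c/24, and 3260 is used as the default port (it
is the port later registered for iSCSI; no port is fixed in this message).
The candidate_targets() helper is hypothetical.

```python
import ipaddress

ISCSI_PORT = 3260   # assumption: stands in for "the default iSCSI port"
DEFAULT_LUN = 0     # implicit LUN, as in the boot-volume bullet above

def candidate_targets(netblock, port=ISCSI_PORT, lun=DEFAULT_LUN):
    """Expand a netblock wildcard into (ip, port, lun) discovery candidates."""
    net = ipaddress.ip_network(netblock)
    # Iterating the network object yields every address in the block,
    # including the first and last, so a /24 gives 256 entries.
    return [(str(addr), port, lun) for addr in net]

targets = candidate_targets("192.0.2.0/24")
print(len(targets))    # 256
print(targets[0])      # ('192.0.2.0', 3260, 0)
```

A narrower wildcard is just a longer prefix, e.g. "192.0.2.64/28" expands
to 16 candidates with the same call.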

Even if a centralized configuration repository is used (as in Fibre
Channel), this sort of address wildcarding still looks useful in
managing the repository.  The netblock example may be too coarse
a wildcard.

Returning to the issues at the top of this message:

[3] NATs become an issue for network designers/admins.  Storage
becomes something else that they have to get the IP addressing
correct for :-(.  There are precedents for this, as the
default gateway and DNS resolver are already configured via
IP addresses, and if those things move to different IP
addresses, stuff breaks (a browser can be very unhappy
if its host thinks 0.0.0.0 is the only DNS resolver).  The
downside is that a centralized config repository containing
IP addresses becomes a NAT issue - the ALG required to
access that across a NAT is ugly enough that it may be
necessary to configure the network so that this never
happens (which is not the best answer, but may be workable).

[4] Discovery based on IP addresses looks like it works for
boot volumes in a way that URLs don't and scales via wildcarding
in a fashion superior to URLs.  An underlying assumption
I'm making is that storage discovery doesn't need to match
the scale of DNS, and hence centralizing config info isn't
hobbled by NAT issues.

[5] Use of IP addresses would better match the other 3rd
party addressing modes, and removes a dependency of the
third party Target on DNS.  The problems created by NATs
are similar to problems that already exist in 3rd party
addressing, and hence this doesn't make things worse.

[8] An IP connection is identified by two IP addresses and
two ports, so the fact that several thousand of them go through
a common DNS name, or even a common IP address, is not a problem.
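
A small illustration of the point in [8], assuming made-up addresses from
documentation ranges and 3260 as a placeholder target port: connections
that share one target address and port remain distinct as long as the
(source IP, source port) pair differs, even when every client appears
behind the same source IP.

```python
from collections import namedtuple

# A TCP connection is identified by this 4-tuple.
Connection = namedtuple("Connection", "src_ip src_port dst_ip dst_port")

TARGET_IP = "192.0.2.10"   # hypothetical single target address
TARGET_PORT = 3260         # placeholder for the default iSCSI port

# 5000 clients, all seen from one source IP (e.g. behind a NAPT),
# each with its own source port.
conns = [
    Connection("198.51.100.1", 40000 + i, TARGET_IP, TARGET_PORT)
    for i in range(5000)
]

# Every 4-tuple is unique even though three of the four fields are shared.
print(len(set(conns)))   # 5000
```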

With the exception of the comment on T10 and 3rd party
naming, this is all IMHO.  Fire away ...

--David
---------------------------------------------------
David L. Black, Senior Technologist
EMC Corporation, 42 South St., Hopkinton, MA  01748
+1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
black_david@emc.com       Mobile: +1 (978) 394-7754
---------------------------------------------------