
    iSCSI Naming and Discovery



    The emails from Daniel Smith and Josh Tseng provide
    a good start on a naming discussion.  With
    my co-chair hat off, let me try to add to it.
    
    The concepts surrounding NAT (Network Address Translation)
    figure strongly in this discussion.  RFC 2663 is a
    good source of background for anyone who needs it.
    
    An attempt to group and summarize what I see in
    Daniel and Josh's emails:
    
    [1] Internet infrastructure is beyond the control
    of storage, including iSCSI [Daniel - a),b),f),g),h)]
    [2] DNS is the naming structure of the Internet, and
    hence hosts exporting storage services have to be
    nameable via DNS [Daniel c)].
    [3] Network Address Translation and Network Address
    Port Translation exist and have to be dealt with [Daniel
    d), Josh (1)].
    [4] Discovery at Internet scale is a hard problem.  A
    flexible naming mechanism keeps options open. [Daniel e)].
    [5] SCSI 3rd party commands need to name LUNs, including
    iSCSI LUNs [Daniel i)]
    *[6] Source, destination, and contents of any packet
    on the Internet are public information [Daniel j)].
    *[7] Security matters for SSPs [Daniel k)].
    [8] SSPs can be expected to connect a lot of clients
    to a single DNS address [Daniel l)].
    *[9] Identification of storage and authentication are
    separate problems and use separate names. [Josh (2)]
    [10] Naming/routing information is necessary for proxies
    to identify which of the entities they are proxying for
    is involved in traffic
    
    [1] and [2] are general descriptions of the Internet.
    
    [6], [7], and [9] are (mostly) about security, and aside
    from noting that [6] is incorrect (all of that info can
    be hidden by a security gateway using IPsec tunnels and
    both IPsec and SSL/TLS hide payloads), security discussion
    might be better deferred as I understand that there will
    be a serious security proposal in the next version of the
    iSCSI draft.
    
    The biggest underlying problem seems to be how to identify
    an Initiator or Target - it's part of [3], [5], [8] and
    [10].  [4] is about Discovery, which compounds the naming
    issues.
    
    At a high level there are three basic ways to identify
    initiators and targets:
    - Transport address (e.g., IP address, FC port WWN).
    - Identification information provided as part of session
      establishment (e.g., username/password, certificate).
    - Some combination of the above two.
    The third alternative may be problematic if it leads to
    needing both the transport and identification information
    to determine identity.  The discussions I've seen seem to
    be using transport as a hint that may make the identification
    easier to verify, which seems like a reasonable optimization
    over relying on the identification information alone.  Moving
    beyond this (e.g., the *.xyz.com servers may only connect
    from addresses in the a.b.c/24 netblock) increases the
    amount of information that has to be configured (e.g.,
    that example is better left to firewalls to enforce).
    
    NAT mechanisms contribute to the problem by producing
    networks in which transport addresses aren't useful for
    identifying anything on the other side of the NAT.  Some
    NATs can be configured to make this problem somewhat simpler
    via static assignment of IP addresses in one domain to IP
    addresses in another.
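
    To make the point about static assignment concrete, here is a minimal
    sketch.  The mapping table, the shared outside address, and both helper
    names are hypothetical and not taken from any NAT implementation; the
    addresses come from documentation and private ranges.  It only shows
    which inside hosts stay distinguishable by transport address once
    translated.

```python
# Hypothetical static NAT assignments: inside address -> outside address.
STATIC_MAP = {"10.0.0.5": "203.0.113.5", "10.0.0.6": "203.0.113.6"}

# Assumption: every host without a static entry shares one outside address.
SHARED_OUTSIDE = "203.0.113.1"


def outside_address(inside_ip):
    """Source address the far side of the NAT sees for an inside host."""
    return STATIC_MAP.get(inside_ip, SHARED_OUTSIDE)


def identifiable_by_address(inside_ips):
    """Inside hosts whose translated address is unique to them."""
    seen = {}
    for ip in inside_ips:
        seen.setdefault(outside_address(ip), []).append(ip)
    return sorted(ips[0] for ips in seen.values() if len(ips) == 1)


hosts = ["10.0.0.5", "10.0.0.6", "10.0.0.7", "10.0.0.8"]
print(identifiable_by_address(hosts))  # ['10.0.0.5', '10.0.0.6']
```

    Only the statically mapped hosts can be told apart by address from the
    other side; the remaining two collapse onto the shared address.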
    
    NATs are a thorny subject in the IETF - while they are widely
    deployed, not all the important protocols work through them;
    IPsec AH is the most notable example, and FTP requires a kludge
    (er, ah, ALG).  IMHO, restrictions on the use of NATs with iSCSI
    may be ok, provided that the consequences are clearly understood.
    
    NATs seem to be a large piece of the forcing function that is
    leading us away from the transport-based identification
    information used by other SCSI transports.  There are security
    consequences here (e.g., cryptography for session establishment
    may become mandatory to implement and use), and it will likely
    complicate discovery.
    
    For example, I noted that booting is an issue - if iSCSI always
    uses URLs to name storage, the result could be a situation in
    which a DNS server has to be operational and reachable in order
    to boot.  This seems wrong, and the obvious answer of using an
    IP address in a URL does not work through NATs, even though
    working through NATs was the original motivation for using URLs.
    
    Third-party naming is a tarpit.  Putting my WG co-chair hat
    back on for this paragraph only, I observe that global context
    for third party names is an unsolved problem in T10; in
    general, the Initiator of a 3rd party command must use
    names that resolve to the desired LUNs from the 3rd party
    command Target's naming perspective.  How to do this when
    an Initiator and Target don't share a naming context is
    unspecified :-).  While it would be a plus for iSCSI to solve
    this one by having global names for 3rd party commands,
    I don't think this is a requirement (and whatever we do
    will have to be worked through T10, as they have the final
    say on name formats).  WG co-chair hat now comes off ...
    
    IMHO, discovery is not getting enough attention.  The proposed
    naming scheme complicates discovery without a compelling
    solution; I'm concerned that the benefits may not justify the
    costs.  My quasi-random walk through this goes something like:
    
    - Not being able to find the boot volume because DNS is down
    	or not responding seems to be a wrong answer.  The infamous
    	World Wide Wait is bad enough for a browser; it's unacceptable
    	for a reboot.
    - Hence the boot volume has to be locatable via IP, a TCP port
    	(which could be implicit, e.g. the default iSCSI port)
    	and a LUN (which could be implicit, e.g., LUN 0).  Unlike
    	DNS, I could see putting this into an iSCSI HBA card BIOS.
    - Consistency suggests the approach of using IP addresses as
    	the primary means of identifying possible storage locations
    	(with the possible addition of non-default TCP port numbers).
    	A nice consequence of this is that one can use ranges
    	(e.g., all the storage is in netblock a.b.c/24 at the
    	default iSCSI port, scan those 256 addresses as part of
    	discovery on boot).  The corresponding range wildcarding
    	mechanisms for URLs will be more complex.
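
    The netblock wildcard in the last item can be sketched as follows.
    This is an illustration only: the function name is hypothetical, the
    port value 3260 is an assumption (the message refers to a default
    iSCSI port without naming a number), and the code merely enumerates
    the locations to probe rather than contacting anything.

```python
import ipaddress

# Assumption: 3260 stands in for "the default iSCSI port"; the message
# itself does not give a number.
ASSUMED_ISCSI_PORT = 3260


def candidate_targets(netblock, port=ASSUMED_ISCSI_PORT, lun=0):
    """Return every (ip, port, lun) triple in the netblock to probe at boot."""
    network = ipaddress.ip_network(netblock)
    # Iterating the network yields all of its addresses, so a /24 gives 256.
    return [(str(addr), port, lun) for addr in network]


targets = candidate_targets("192.0.2.0/24")
print(len(targets))  # 256
print(targets[0])    # ('192.0.2.0', 3260, 0)
```

    A narrower wildcard is just a longer prefix, e.g. "192.0.2.0/28" for
    16 addresses.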
    
    Even if a centralized configuration repository is used (as in Fibre
    Channel), this sort of address wildcarding still looks useful in
    managing the repository.  The netblock example may be too coarse
    a wildcard.
    
    Returning to the issues at the top of this message:
    
    [3] NATs become an issue for network designers/admins.  Storage
    becomes something else that they have to get the IP addressing
    correct for :-(.  There are precedents for this, as the
    default gateway and DNS resolver are already configured via
    IP addresses, and if those things move to different IP
    addresses, stuff breaks (a browser can be very unhappy
    if its host thinks 0.0.0.0 is the only DNS resolver).  The
    downside is that a centralized config repository containing
    IP addresses becomes a NAT issue - the ALG required to
    access that across a NAT is ugly enough that it may be
    necessary to configure the network so that this never
    happens (which is not the best answer, but may be workable).
    
    [4] Discovery based on IP addresses looks like it works for
    boot volumes in a way that URLs don't and scales via wildcarding
    in a fashion superior to URLs.  An underlying assumption
    I'm making is that storage discovery doesn't need to match
    the scale of DNS, and hence centralizing config info isn't
    hobbled by NAT issues.
    
    [5] Use of IP addresses would better match the other 3rd
    party addressing modes, and removes a dependency of the
    third party Target on DNS.  The problems created by NATs
    are similar to problems that already exist in 3rd party
    addressing, and hence this doesn't make things worse.
    
    [8] A TCP/IP connection is identified by two IP addresses and
    two ports; the fact that several thousand of them go through
    a common DNS name, or even a common IP address, is not a problem.
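
    A small sketch of the four-tuple argument in [8].  The addresses are
    documentation addresses, the port number is an assumption, and the
    Connection type is hypothetical; the point is only that connections
    sharing one target address and port remain distinct.

```python
from typing import NamedTuple


class Connection(NamedTuple):
    """The four values that identify one connection."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


# Hypothetical SSP target; the port value is an assumption.
TARGET = ("198.51.100.10", 3260)

# 3000 client connections from three source addresses, all to one target.
connections = {
    Connection(f"192.0.2.{1 + i % 3}", 49152 + i // 3, *TARGET)
    for i in range(3000)
}

print(len(connections))  # 3000
print(len({(c.dst_ip, c.dst_port) for c in connections}))  # 1
```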
    
    With the exception of the comment on T10 and 3rd party
    naming, this is all IMHO.  Fire away ...
    
    --David
    ---------------------------------------------------
    David L. Black, Senior Technologist
    EMC Corporation, 42 South St., Hopkinton, MA  01748
    +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
    black_david@emc.com       Mobile: +1 (978) 394-7754
    ---------------------------------------------------
    
    



Last updated: Tue Sep 04 01:06:52 2001