iSCSI: CONNECT message (was Naming/Discovery/URLs)

To: ips@ece.cmu.edu
Subject: iSCSI: CONNECT message (was Naming/Discovery/URLs)
From: "Jim Hafner/Almaden/IBM" <hafner@almaden.ibm.com>
Date: Wed, 4 Oct 2000 14:57:19 -0700
Folks,

I want to chime in here with a couple of things (well, this is actually
going to be a long note with lots of things).  This is based on private
discussions with a couple of people, including Daniel Smith (though I
expect he doesn't agree with all this).

There's a SUMMARY at the end, if you want to skip there, and an appendix
with some definitions.

First off, I think Costa has hit one of the nails on the head, but I'd like
to frame things a bit differently: make some specific proposals, pose
some specific (and narrow) questions, move parts of the
naming/discovery/etc. issue to a different thread, and preachify for a
while (<soapbox>I can't help myself!</soapbox>).

IMHO, the most important (and obvious) first step in getting an initiator
and target together to talk SCSI is the establishment of a TCP connection
(this has nothing, per se, to do with iSCSI as they can't do the login step
until that is done).  Also, the *wire protocol* for this step is
independent of management/naming/etc. deployment.  That deployment topic is
another "can of worms" and I don't want to address it here.  So, here goes!

I'm proposing a change to the iSCSI wire protocol that will enable the
establishment of a TCP connection between the initiator and target through
gateways/firewalls, etc.  Details below, but let's start with a question.

QUESTION1: what information does an initiator need to make a connection to
a target?

ANSWER1: it needs at least an ipaddress to open the first connection. It
also needs more, namely an identifier of the target itself, so the receiver
at that ipaddress can tell if this is really for him or for someone else.
So, the initiator needs a pair consisting of (ipaddress, TargetID).  [This
agrees with Costa's "IP address model" paragraph and is what he calls a
"target name".]  N.B. I specifically say ipaddress and not ipname (as I'm
assuming that resolution has already taken place, one way or another).

NOTE 1) I haven't addressed where this information comes from (see NOTE 3
below).

PROPOSAL: the first step in iSCSI phase should be something like "CONNECT:
TargetID".  The iSCSI login should only occur between the initiator and
actual target, after a connection is established end-to-end.

Here's how this works.  The initiator uses the ipaddress in the target name
pair to open a TCP connection.  The first message on that connection is
"CONNECT: TargetID".  This should be interpreted by the receiver as a
request to establish a connection to the device defined by the TargetID (in
the receiver's context).  If the receiver is not the target itself, it must
be some gateway.  The gateway resolves the TargetID into another target
name pair valid on the other side of the gateway.  The gateway then opens a
connection to the new ipaddress and sends the "CONNECT: newTargetID"
message.  This continues and propagates until the target name pair finally
resolves to the target (that is, when the TargetID in the pair belongs to
the same device as that of the ipaddress of the pair).  When the target is
the recipient of the "CONNECT" message for itself, it responds "OK". This
"OK" propagates back through each "hop" to the initiator.  At this
point, there is an established socket connection (perhaps through multiple
intermediaries) from initiator to target.  Now the iSCSI login can start.

This is analogous to the http GET protocol.  It's different in that each
gateway provides a coupler between two socket connections and must maintain
that state (or be able to reconstruct that state) as long as the "in"-end
stays open.  Such couplers might have security filters, or other things
which are an independent function of the gateway's two ends.
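
The hop-by-hop resolution described above can be sketched as a small
simulation.  This is only an illustration under stated assumptions, not a
proposed implementation: the `Node` class, the `routes` table, the
`connect` function and the `max_hops` limit are all hypothetical names
introduced here, and real gateways would open TCP connections rather than
look each other up in a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A device at one ipaddress: either the target itself or a gateway."""
    # TargetIDs this device answers for itself (non-empty for a target).
    own_ids: set = field(default_factory=set)
    # Gateway table (hypothetical): TargetID -> (next ipaddress, newTargetID).
    routes: dict = field(default_factory=dict)


def connect(network, ipaddress, target_id, max_hops=8):
    """Follow "CONNECT: TargetID" hop by hop.

    Returns the list of (ipaddress, TargetID) pairs traversed once the
    target answers "OK".  max_hops is an assumed guard against routing
    loops; the original note does not mention one.
    """
    path = []
    for _ in range(max_hops):
        node = network.get(ipaddress)
        if node is None:
            raise ConnectionError(f"no device at {ipaddress}")
        path.append((ipaddress, target_id))
        if target_id in node.own_ids:
            # The receiver is the target: "OK" travels back along `path`.
            return path
        if target_id not in node.routes:
            raise LookupError(f"{ipaddress} cannot resolve {target_id!r}")
        # The receiver is a gateway: resolve to a pair valid on its far side.
        ipaddress, target_id = node.routes[target_id]
    raise RuntimeError("too many hops")
```

For example, with a gateway at 10.0.0.1 that maps "disk-a" to
("192.168.1.5", "array-7") and a target at 192.168.1.5 that owns
"array-7", `connect(network, "10.0.0.1", "disk-a")` returns both hops in
order, which is the path the "OK" would retrace.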

Let me put this in the context of the existing proposals. In the current
drafts, the target name concept here

a) maps to a URL containing some path information to the target (that is, a
possibly unresolved ipname and path+query information -- the ipaddress in
my target name pair is the "resolved" ipname of this URL; the TargetID is
the rest of the URL).

b) and is embedded in the Text portion of the login message (where it would
have to be parsed in the login phase by the gateway). (This is assumed by
Costa, I think.)

So, I'm proposing then to move this "gateway/proxy/whatever"
connection-intermediary process out of the iSCSI login and into a different
message and as the first phase of the iSCSI wire protocol. It will be more
generic for the gateway to deal with and the information will be available
upfront, not buried in the bowels of the Text portion of the Login.
[Aside: one might suggest that IETF formalize and generalize this sort of
"CONNECT" protocol independent of iSCSI or http or any other protocol.  I'm
not biting that off here, but it is a thought...]

QUESTION2: what form should the TargetID take?

ANSWER2A: it should be human-readable (e.g., look like a URL).  This has
some advantages in that some gateways already can parse URLs in http GET
messages, so there's not much extra work.

ANSWER2B: it should be machine-parseable (e.g., a byte-structured set of
fields).  This has its own advantages in performance and possibly
security.

ANSWER2C: either one, so long as we have some bit in the wire protocol
(header of the "CONNECT" packet?) that tells the gateway what parsing rules
to apply to the TargetID.

I have no strong bias between these choices, so I open that up for
discussion.  My weak bias is for 2B as I think that might be simpler to
implement in gateways.  That does require some serious thinking about
structure of the fields, however.
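
To make option 2C concrete, here is a minimal sketch of a "CONNECT"
message whose header carries a format code telling the receiver which
parsing rules apply to the TargetID.  Everything about the layout is an
assumption made for illustration (one format byte, a two-byte big-endian
length, then the TargetID bytes); no message format has been proposed yet,
and the constant and function names are hypothetical.

```python
import struct

# Hypothetical format codes for the TargetID that follows the header.
FORMAT_TEXT = 0    # human-readable, URL-like (ANSWER2A)
FORMAT_BINARY = 1  # byte-structured fields (ANSWER2B)

HEADER = "!BH"  # assumed layout: format code (1 byte), length (2 bytes)
HEADER_SIZE = struct.calcsize(HEADER)


def encode_connect(fmt, target_id):
    """Build a CONNECT message from a format code and TargetID bytes."""
    return struct.pack(HEADER, fmt, len(target_id)) + target_id


def decode_connect(message):
    """Return (format code, TargetID), parsed according to the format code."""
    fmt, length = struct.unpack_from(HEADER, message)
    body = message[HEADER_SIZE:HEADER_SIZE + length]
    if len(body) != length:
        raise ValueError("truncated TargetID")
    if fmt == FORMAT_TEXT:
        return fmt, body.decode("ascii")
    if fmt == FORMAT_BINARY:
        return fmt, body  # field structure left open, as in the discussion
    raise ValueError(f"unknown TargetID format {fmt}")
```

A gateway that only understands one of the two forms could reject the
other after reading the first byte, without attempting to parse the
TargetID at all.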

NOTE 2) the TargetID is a name that is valid in the name/address domain
common to the initiator and the device at the end of the ipaddress (the
gateway).  It may or may not get changed at the gateway.  That is, it is
not necessarily globally unique (though it might contain something with
this property). See QUESTION3.

NOTE 3) how does the initiator (or even the gateway) get the (ipaddress,
TargetID)? That's a management infrastructure issue.  That's NOT an iSCSI
wire protocol, which is all I'm trying to discuss here.
<soapbox> There are an infinite number of ways this might be done,
including combinations of LDAP (or other directory service), DNS,
S(scsi)NS, or anything a management implementation/deployment might choose.
I'd suggest tabling that discussion for now.  It's not clear (to me) that
this is an issue for this WG to deal with, at least for NOW.  E.g., it
could very well be that the initiator gets it from a file or registry in
the OS.  Does IETF want to spec this?</soapbox>

QUESTION3: what's in a TargetID and what's NOT?

ANSWER3: A TargetID contains a name (direct or indirect, globally unique or
not) for the target device.   It is an open question what the contents of
this TargetID should be.  The minimum requirement is that it have meaning
to the receiver of the CONNECT message.
<soapbox> It does NOT contain anything that involves LUs (or LUNs) or
initiator identifiers, or authentication or ..... All of this additional
information is not relevant to establishing the TCP connection.  That stuff
(with the exception of LU information) might be very relevant for the login
step, which comes later.  LU information is not relevant in this space at
all. SCSI knows how to deal with LUs. </soapbox>

QUESTION4:  What about 3rd party issues?

ANSWER4: This is a multi-part answer.

a) As already noted by others, the *only* important issue in this space is
identifiers for 3rd party targets.  LU identifiers (LUNs or Proxy Tokens a
la Access Controls in SCSI) are handled already by SCSI.

b) the target identifier can be arbitrarily long if we assume that T10
adopts some proposal for aliasing long identifiers to 8-byte identifiers
used in third party copy commands like EXTENDED COPY and some of the XOR
commands (as noted by Ralph Weber).

c) within the SCSI third-party command formats (after resolving aliases),
the target identifier should have meaning to the device to which the
third-party command is directed.  So if initiator Ian wants the SCSI copy
manager in target Tom to do stuff on target Tim, then Ian has to identify
Tim to Tom using a target identifier valid from *Tom's perspective* (this
may have no meaning at all for Ian).  Note that this target name may be an FC
name (as currently modeled in SCSI's EXTENDED COPY) or it might be an iSCSI
target name pair (ipaddress, TargetID), or it might be a parallel SCSI bus
address (if Tim and Tom are on the same bus).

d) How Ian gets the information about Tom's view of Tim is, again IMHO, a
management infrastructure issue (which I'm deferring to another thread). In
other words, it's not a function of the wire protocol or the SCSI protocol,
per se.  There are things we might be able to do to facilitate this (see
discussion point below about TargetID structure), but they belong in a
separate thread.  Keep in mind that this very well might involve
cross-transport issues.

----------------------------------------------------------------------
SUMMARY:

PROPOSAL: first step in iSCSI protocol is "CONNECT", not login. Once
"CONNECT" is successful end-to-end (initiator to target), then the login
can proceed.

ADDITIONAL PROPOSAL: "CONNECT" responses could be one of
a) "OK" (meaning the end-to-end connection has been established)
b) "ERROR" (meaning somebody along the line couldn't establish their hop in
the connection, for whatever reason)
c) "REDIRECT" to a new target name pair (Costa suggests something like
this, as well).
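
As a rough sketch of how an initiator might act on these three responses,
the loop below retries with the new target name pair on "REDIRECT" and
gives up on "ERROR".  The `Status` enum, the `establish` function, the
`send_connect` callback and the `max_redirects` bound are hypothetical
names and assumptions added for illustration; the note itself does not
say how many redirects should be followed.

```python
from enum import Enum


class Status(Enum):
    OK = "OK"
    ERROR = "ERROR"
    REDIRECT = "REDIRECT"


def establish(send_connect, ipaddress, target_id, max_redirects=4):
    """Issue CONNECT and follow REDIRECT responses.

    send_connect(ipaddress, target_id) is an assumed callback returning
    (Status, new_pair), where new_pair is an (ipaddress, TargetID) tuple
    for REDIRECT and None otherwise.  Returns the pair that answered OK.
    """
    for _ in range(max_redirects + 1):
        status, new_pair = send_connect(ipaddress, target_id)
        if status is Status.OK:
            return ipaddress, target_id  # end-to-end connection is up
        if status is Status.ERROR:
            raise ConnectionError(f"CONNECT to {target_id!r} failed")
        # REDIRECT: retry against the new target name pair.
        ipaddress, target_id = new_pair
    raise ConnectionError("too many redirects")
```

Only after `establish` returns would the iSCSI login begin on the
resulting connection.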

DISCUSSION POINT: what form should the TargetID take (URL, structured-byte
fields, combo)?  Should it contain some globally unique identifier of the
target (if so, who owns that namespace, how is it discovered, and does it
involve security exposures)? Should it contain hints on how to find an
address (the next stop in the hop)? E.g., should it contain a context for
another name server?  What can it contain to help with the third party
problem (Ian getting Tom's context for Tim)?

If there is general (or some) agreement that the main proposal here is a
good idea, I'd be willing to propose a specific message format for the
"CONNECT" protocol and response format.

Can I suggest that discussion of the specifics of the CONNECT protocol come
under this or a narrower discussion thread; discussion of the format of the
TargetID come under a different thread (among those who generally go along
with the CONNECT idea); and discussion of the "how does the initiator get
the target name pair" go under a different thread (like Management or
NameService)?

----------------------------------------------------------------------
APPENDIX:
Two definitions:

INITIATOR: a "SCSI Initiator Device". In the current T10 thinking, this is
pretty much a physical port on the network which speaks IP and has a SCSI
command generator (i.e., application client).

TARGET: a "SCSI Target Device". In the current T10 thinking, this is pretty
much a collection of physical ports on the network, each speaking IP and
each sharing a set of one or more SCSI logical units. That is, this can be
a multi-ported device with multiple ipaddresses or names.

N.B. both definitions are still under debate in T10.


Jim Hafner