iSCSI: comments on draft
Folks,
Here's some rather extensive comments on the draft. Some are editorial,
some are technical (minor and major) and some are questions. My apologies
for the length. I tried to <snip> as much as possible and still leave
enough context.
<JIM>
The naming and discovery team (NDT) is moving in the direction that
targets may also listen on other TCP ports, so if that is adopted,
this will need to be reworded (as will other parts of this document
when NDT is done).
</JIM>
1.1 <snip>
A "SCSI transport" maps the client-server SCSI protocol to a specific
interconnect. Initiators are one endpoint of a SCSI transport. The
"target" is the other endpoint. A "target" can have multiple LUs
behind it. Each logical unit has a number called a LUN.
<JIM>
I would rephrase this as "Each logical unit has an address within a
target called a LUN".
</JIM>
A SCSI task is a SCSI command or possibly a linked set of SCSI
commands. Some LUNs support multiple pending (queued) tasks. The
queue of tasks is managed by the target, though. The target uses an
initiator provided "task tag" to distinguish between tasks. Only one
command in a task can be outstanding at any given time.
<JIM>
I would also be careful throughout to not use the term LUN for logical
unit; I would replace "LUNs" in the second sentence above by "logical
units". [There is one other occurrence noted below for this colloquial
misuse of the term LUN.]
</JIM>
<snip> (still 1.1)
Each SCSI command results in an optional data phase and a response
phase. In the data phase, information can travel from the initiator
to target (e.g. WRITE), target to initiator (e.g. READ), or in both
directions. In the response phase, the target returns the final
status of the operation, including any errors. A response terminates
a SCSI command.
<JIM>
The first sentence can be interpreted that the response phase is
optional. Is this intended or do we want something like "involves an
optional data phase and a required response phase". [Note: is the
word "results" the right one here?]
</JIM>
Command Data Blocks (CDB) are the data structures used to contain the
command parameters to be handed by an initiator to a target. The CDB
content and structure is defined by [SAM] and device class specific
SCSI standards.
<JIM>
"device-type specific" to use the SAM words.
</JIM>
1.2.3 Timers and timeouts
<snip>
<JIM>
Why are the initiator timers mandatory? Isn't it up to the
implementation to decide if there are timing requirements? There is
no target requirement here, so how do you even know this is working?
</JIM>
<snip>
1.2.6 iSCSI Full Feature Phase
Once the initiator is authorized to do so, the iSCSI session is in
iSCSI full feature phase. The initiator may send SCSI commands and
data to the various LUNs on the target by wrapping them in iSCSI
<JIM> LUNs->logical units </JIM>
<snip>
An initiator MAY request, at login,
to send immediate data blocks of any size. If the initiator requests
a specific block size the target MUST indicate the size of immediate
data blocks it is ready to accept in its response. Beside iSCSI,
SCSI also imposes a limit on the amount of unsolicited data a target
is willing to accept. The iSCSI immediate data limit MUST not exceed
the SCSI limit.
<JIM>
We should give a reference to where this limit is defined and
specified in the SCSI world (Mode Page 02h, disconnect/reconnect page,
First Burst Size) in SPC-2.
</JIM>
1.2.7 iSCSI Connection Termination
Connection termination is assumed an exceptional event.
Graceful TCP connection shutdowns are done by sending TCP FINs.
Graceful connection shutdowns MUST only occur when there are no
outstanding tasks that have allegiance to the connection. A target
SHOULD respond rapidly to a FIN from the initiator by closing it's
half of the connection as soon as it has finished all outstanding
tasks that have allegiance to the connection. Connection termination
with outstanding tasks may require recovery actions.
<JIM>
Can/should this have some definition of what "finish all outstanding
tasks" means? E.g., Abort tasks -- if you don't abort, where are you
going to send status?
</JIM>
2.2 SCSI Command
<snip>
<JIM>
Is there a strong reason to put the Bidi stuff AFTER all the other
stuff and not within the context of the main header (similar question
as here in the context of the response PDU) or to not have it like
FCP-2 where the DL field is *after* the CDB and the proposed FCP-2 has
the Bidi-READ DL field after that normal DL field?
It looks kludgy (spelling?) to have a separation of the two DL fields
by other stuff.
</JIM>
2.2.6 CDB - SCSI Command Descriptor Block
There are 16 bytes in the CDB field to accommodate the largest
currently defined CDB. Whenever larger CDBs are used, the CDB
spillover MAY extend beyond the 48-byte header.
<JIM>
There are larger than 16-byte CDBs defined already (see SPC-2 and SBC-2).
Perhaps a better phrasing is "to accommodate the size of the most
commonly used CDBs".
</JIM>
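As an illustration of the spillover idea (not text from the draft), a sender could split a CDB into the fixed 16-byte header field and whatever remains. Zero padding of short CDBs is an assumption of this sketch, and `split_cdb` is a hypothetical helper name.

```python
from typing import Tuple

CDB_FIELD_LEN = 16  # size of the CDB field in the header, per the quoted text


def split_cdb(cdb: bytes) -> Tuple[bytes, bytes]:
    """Split a CDB into (header_field, spillover).

    Short CDBs are zero-padded to 16 bytes (an assumption); anything past
    16 bytes is returned as spillover to be carried after the header.
    """
    header_field = cdb[:CDB_FIELD_LEN].ljust(CDB_FIELD_LEN, b"\x00")
    spillover = cdb[CDB_FIELD_LEN:]
    return header_field, spillover
```

A 6-byte or 10-byte CDB yields an empty spillover, while a 32-byte CDB yields 16 bytes of spillover.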
2.3.1 Byte 1 - Flags
<snip>
b4-6 not used (SHOULD be set to 0)
<JIM>
Should we use the T10 style "reserved" model for unused bits and
require that the target check that these are zero, so that in the
future, if a definition is given to them, we won't have to worry about
bad initiators that didn't initialize these bits correctly?
Related, what's the error path for the target if something is wrong
(like "o" and "u" both set)? (See next item.)
Note: initiators don't need to care about bad fields because there's
nothing they can do about it!
</JIM>
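For illustration, the "reserved" style of check suggested above could look like the following sketch. It assumes bit 0 is the least significant bit of the flags byte, so bits 4-6 correspond to the mask 0x70; the names are hypothetical.

```python
RESERVED_MASK = 0x70  # bits 4-6, assuming bit 0 is the least significant bit


def reserved_bits_clear(flags: int) -> bool:
    """Return True if the unused bits b4-6 of the flags byte are all zero."""
    return (flags & RESERVED_MASK) == 0
```

What the target does when this returns False is exactly the open error-path question; the sketch only shows the check itself.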
<snip> (this next is on Task Management)
2.6.1 Function
<snip>
For the <Clear Task Set> the target MUST send an Asynchronous Event
to all other attached initiators to inform them that all pending
tasks are cancelled and then enter the ACA state for any initiator
for which it had pending tasks.
<JIM>
So we are requiring AE? All other protocols don't (AFAIK).
Also, we should be aware of the new SCSI status that has been approved
for SAM-2, called TASK ABORTED. This is used (under certain conditions
to deal with legacy hosts) to inform an initiator that its tasks were
aborted by the actions of another initiator. I think that an
initiator that has requested the TASK ABORTED status (via Control Mode
Page) should NOT be given the AE and it should be handled by this new
status.
I'd like to hear from other people (particularly others who are closer
to SAM-2) on the need for AE in this case as well.
</JIM>
<snip>
In addition, for the <Target Cold Reset> the target then MUST
terminate all of its TCP connections to all initiators (all sessions
are terminated). However, if the target finds that it cannot send the
required response or AE it MUST continue the reset operation and it
SHOULD log the condition for later retrieval.
<JIM>
Are we spec'ing specifics on the content of this "log" and on the
methods for retrieving that log? And who can ask for the log?
</JIM>
2.8 SCSI Data
<snip>
Byte /    0       |       1       |       2       |       3       |
   /              |               |               |               |
  |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
  +---------------+---------------+---------------+---------------+
 0|F| 0x05        |1|0| Reserved (0)                              |
  +---------------+---------------+---------------+---------------+
 4| Length                                                        |
  +---------------+---------------+---------------+---------------+
 8| LUN or Reserved (0)                                           |
12|                                                               |
  +---------------+---------------+---------------+---------------+
16| Initiator Task Tag                                            |
  +---------------+---------------+---------------+---------------+
20| Target Task Tag (solicited) or Reserved (0) (unsolicited)     |
  +---------------+---------------+---------------+---------------+
24| Reserved (0)                                                  |
  +---------------+---------------+---------------+---------------+
28| ExpStatRN                                                     |
  +---------------+---------------+---------------+---------------+
32/ Reserved (0)                                                  /
  /                                                               /
  +---------------+---------------+---------------+---------------+
40| Buffer Offset                                                 |
  +---------------+---------------+---------------+---------------+
44| Reserved (0)                                                  |
  +---------------+---------------+---------------+---------------+
48/ Payload                                                       /
 +/                                                               /
  +---------------+---------------+---------------+---------------+
<JIM>
This specifies that bytes 8-15 are LUN or Reserved (0). Which is it?
Under what conditions is the LUN required? What happens if the LUN
doesn't match the one that task tag implies (or is that not a
problem?)?
</JIM>
<snip>
2.15 Map Command
Byte /    0       |       1       |       2       |       3       |
   /              |               |               |               |
  |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
  +---------------+---------------+---------------+---------------+
 0|F| 0x07        |1|0| Function  | Reserved (0)                  |
  +---------------+---------------+---------------+---------------+
 4| Length                                                        |
  +---------------+---------------+---------------+---------------+
 8| Reserved (0)                                                  |
  +                                                               +
12|                                                               |
  +---------------+---------------+---------------+---------------+
16| Initiator Task Tag                                            |
  +---------------+---------------+---------------+---------------+
20| Reserved (0)                                                  |
  +---------------+---------------+---------------+---------------+
24| CmdRN                                                         |
  +---------------+---------------+---------------+---------------+
28| ExpStatRN                                                     |
  +---------------+---------------+---------------+---------------+
32/ Reserved (0)                                                  /
 +/                                                               /
  +---------------+---------------+---------------+---------------+
48| Descriptor Type               | Descriptor Length             |
  +---------------+---------------+---------------+---------------+
52/ Descriptor                                                    /
 +/                                                               /
  +---------------+---------------+---------------+---------------+
---------------------------------------------------------------------
  +---------------+---------------+---------------+---------------+
  | Descriptor Type               | Descriptor Length             |
  +---------------+---------------+---------------+---------------+
  / Descriptor                                                    /
 +/                                                               /
  +---------------+---------------+---------------+---------------+
or
  +---------------+---------------+---------------+---------------+
48| 8 byte Descriptor                                             |
 +|                                                               |
  +---------------+---------------+---------------+---------------+
---------------------------------------------------------------------
  +---------------+---------------+---------------+---------------+
 N| 8 byte Descriptor                                             |
 +|                                                               |
  +---------------+---------------+---------------+---------------+
<JIM>
I don't understand the format of this command, especially in the "or"
case. This looks like just a list of 8 byte descriptors. To what do
they MAP?
</JIM>
<snip>
2.15.1 Function
<snip>
Address/access control descriptors follow the header. For the map
function the following descriptor types are defined:
0 Binary IP Version 4 TCP address (IP+Port) followed by a
selector string; length should be 6+the selector length+1
1 Binary IP Version 6 TCP address (IP+Port) followed by a
selector string; length should be 18+the selector length+1
2 iSCSI URL (domain name terminated with null followed by a
selector followed by null)
3 FC address & port - in case access control is based on
transport ID
4 access proxy token
Details for 3 & 4 have to be coordinated with T10
For the unmap function the descriptors are standard 8 byte SRAs (SCSI
Reference Address)
<JIM>
Where are the SRAs specified in this format? Am I really missing
something?
There is only one reason for this Map command (AFAIK). Namely, to map
long IP-style addressing mechanisms (e.g., IPv6) to a smaller 8-byte
alias for the purposes of third party addressing in SCSI commands like
EXTENDED COPY and some of the XOR commands. The limitation of the
current spec (SPC-2) on those commands is that the Target identifier
is only 8 bytes long (and cannot be extended). Specifically, the only
need is for name or address resolution of the Target Device and NOT
for the logical unit (that is already handled by other 8 byte fields
in the target descriptors).
SPC-2 and SPC-3 already have descriptors for FC address & port (SPC-2)
and IPv4 (just approved for SPC-3) so there is no need for that here.
Note also that the IPv4 target descriptor included a 2-byte field for
"Protocol" (that is, UDP, TCP, etc.).
I firmly believe that T10 will approve a SCSI version of this command
(very soon; there is definite movement in this direction as there is
need for this both in iSCSI and in SRP (formerly known as SVP)) so
that this is NOT needed at all; I personally recommend removing this
so as not to create confusion later.
I may be uninformed, but what is a "selector string"?
Are maps initiator specific or global for the target? Are they
volatile? Under what conditions can the target clean out its map
table? Can I blow away someone else's map values? Can I query the
mapping table, either for the entire table or for the specific mapping
of a particular SRA?
I don't know what access control information is relevant here. In
particular:
-- What does FC address & port mean in this context for access
controls based on TransportID (note the spelling as well)?
-- There is *no need* or value in access proxy token in this context.
That is handled *completely* in the SCSI Access Controls as approved
by T10. Proxy Tokens are indirect handles to identify a logical unit
and NOT for identification of a Target Device.
</JIM>
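For reference while reading the questions above, this is a minimal sketch of how a type 0 map descriptor (IPv4 address, port, selector string) might be encoded from the quoted description. The 16-bit Descriptor Type and Descriptor Length fields are read off the diagram; big-endian order, a length that excludes the 4-byte type/length prefix, and an opaque byte-string selector are all assumptions, and the function name is hypothetical.

```python
import socket
import struct

DESCRIPTOR_TYPE_IPV4 = 0  # "Binary IP Version 4 TCP address (IP+Port)"


def encode_ipv4_descriptor(ip: str, port: int, selector: bytes) -> bytes:
    """Encode one type 0 descriptor: type, length, IP, port, selector, NUL.

    The length is 6 + len(selector) + 1, as in the quoted text; whether it
    should also count the type/length prefix is not stated, so this sketch
    assumes it does not.
    """
    body = socket.inet_aton(ip) + struct.pack(">H", port) + selector + b"\x00"
    return struct.pack(">HH", DESCRIPTOR_TYPE_IPV4, len(body)) + body
```

For example, `encode_ipv4_descriptor("10.0.0.1", 3260, b"disk0")` (address, port, and selector are made-up values) produces a 16-byte descriptor with a length field of 12. The sketch says nothing about which 8-byte alias the descriptor maps to, which is the gap the comments point at.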
<snip>
2.21 Third Party Commands
There are some third-party SCSI commands, such as (EXTENDED) COPY and
COMPARE that involve more than one target. In it's most general form
those commands involve the "original target" called the COPY-Manager
and a (variable) number of other machines called source and
destination. The whole operation is described by one "master CDB"
delivered to the Copy manager and a series of descriptor blocks; each
descriptor block addresses a source and destination target and LU and
a description of the work to be done in terms of blocks or bytes as
required by the device types. The relevant SCSI standards do not
require full support of the (EXTENDED) COPY or COMPARE nor do they
provide a detailed execution model. We will assume, in the spirit of
[SPC-2], that a COPY manager will read data from a source and write
them to a destination.
To address them an iSCSI COPY manager will use information provided
to it through map commands and the SRAs and flags provided in the
descriptors - allowing for iSCSI and FC sources and destinations.
Enabling a FC COPY manager to support iSCSI sources and destinations
is subject to coordination with T10.
<JIM>
Note that COPY and COMPARE have now been made Obsolete in SPC-2 so
there probably isn't a reason to mention them here.
The language of the first paragraph reads to me like an editorial
comment on SPC-2. I would suggest wording for this section that more
closely resembles that of FCP-2.
</JIM>
<snip>
6.2.2.4 Encryption
This mode provides for the end-to-end encryption (e.g. IPsec). In
addition to authenticating the client, it provides end-to-end data
integrity and protects against man-in-the-middle attacks,
eavesdropping, message insertion, deletion, and modification.
<JIM>
I thought IPsec provides only link-level security and NOT end-to-end.
Am I wrong?
</JIM>
<snip>
Appendix A
02 Authentication
<snip>
The authentication methods to be used are public key, user/password
or challenge/response.
<JIM>
We don't allow for Kerberos in here (or something like that)?
</JIM>
If public key is selected then each party MUST use:
authenticate:<user-id>,<blob>
where user-id is the SCSI access-id of the host-OS for the initiator
or the World-Wide-Name for the target and blob is the public-key
blob.
<JIM>
As the author of the SCSI access controls, I am personally not
comfortable with use of the AccessID (note the spelling) in this
context for authentication. Note that AccessIDs are NOT used for
security in SCSI, only for identification. For layering reasons and
others, I feel that iSCSI security should be based on independent
principles between the iSCSI entities and NOT on SCSI related
concepts.
Additionally, AccessIDs are NOT required for initiators in SCSI and so
cannot (should not?) be required here.
Finally, let me mention that a (weak) reason to use AccessIDs in this
context is that a given target may decide to reject a login for a
particular AccessID if that initiator has no accessible logical units.
On the other hand, the initiator may have two other reasons (in the
context of SCSI access controls) for connecting to a target *even if*
that initiator has no accessible logical units. These include:
1) the initiator needs to deliver a SCSI ACCESS CONTROL IN/OUT command
to that target to query or change access controls (weak authentication
of the CDB itself is done with a "key" embedded in the command or the
parameter data).
2) the initiator holds a proxy token that (indirectly) references a
logical unit on that target (this token is also embedded in the CDB or
parameter data).
So, either we include these additional tags in parallel with AccessID
for the purposes of authentication OR we don't use any of them.
I vote for not using any of them for this purpose. I don't object to
including a key:value for these three things as additional data in
login (especially after the security context has been established) but
I think they need to be divorced from the authentication procedure
completely.
</JIM>
Jim Hafner
IBM Research