Comments on the Draft

To: ips@ece.cmu.edu
Subject: Comments on the Draft
From: "Hall, Howard" <howard@pirus.com>
Date: Tue, 12 Sep 2000 19:12:38 -0400
Content-Type: text/plain;charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu

Here are some detailed questions, issues, and proposals we have come up with
after reviewing the draft closely.

-Howard

Howard Hall
Pirus Networks
www.pirus.com

-----------------------

Section: General

We need to explicitly describe in the draft how to handle loss of iSCSI
parsing synchronization on a TCP stream.
Resynchronization can be accomplished by shutting down the TCP connection.
If it is the command stream that gets reset, iSCSI needs to "remember" an
error code for this session (for how long?), so that it can let its client
know what happened when the command session is reestablished.
Similarly, the draft should suggest how to respond when the other side
sends messages that don't make sense, e.g. a target sends bogus RTT
messages or repeated "opcode not understood" messages.
We propose that the draft, in general, should state that initiators and
targets should 'play safe', especially by avoiding bogus reads and writes.
Initiators can use the 'escalating big hammer' approach: terminating the
task, then terminating the connection/session, then resetting the device.

Finally, what should a target do if it detects inconsistencies in the data
buffers for a write command, e.g. offsets beyond the end of the total
transfer size, overlapping offsets, a data portion greater than the total
transfer length, or transfer tags that don't correspond to any transfer in
progress? What should an initiator do when it decides that a target is
behaving illegally? For example, what if a 512-byte write is sent and then
we receive an RTT for offset 2048?
We propose that the draft specify that the write command be aborted and a
response sent with the iSCSI status set to 1 (iSCSI check). Better yet,
since we have 8 bits of iSCSI status, why not have more values than just
"good" and "iscsi check" (e.g. not_logged_on, data_in_timeout,
buffer_address_inconsistency)? After seeing a couple of these
consecutively, it should shut down the connection or, better yet, send the
proposed "Out of Sync" command described below.
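
To make the list of inconsistencies concrete, here is a minimal sketch of the
checks a target could run on each incoming write data buffer. The names
(WriteTask, check_write_buffer) and the reason strings are hypothetical and
do not come from the draft; how the result maps onto an iSCSI status value
is left open.

```python
from dataclasses import dataclass, field


@dataclass
class WriteTask:
    total_length: int  # total transfer length of the write, in bytes
    received: list = field(default_factory=list)  # (offset, length) seen so far


def check_write_buffer(tasks, transfer_tag, offset, length):
    """Return None if the buffer is consistent, else a short reason string."""
    task = tasks.get(transfer_tag)
    if task is None:
        return "unknown_transfer_tag"  # tag matches no write in progress
    if length > task.total_length:
        return "data_longer_than_transfer"
    if offset < 0 or offset + length > task.total_length:
        return "offset_past_end"
    for start, size in task.received:
        # Two ranges overlap when each one starts before the other ends.
        if offset < start + size and start < offset + length:
            return "offset_overlap"
    task.received.append((offset, length))
    return None
```

With a 512-byte write in progress, a buffer at offset 2048 is rejected as
"offset_past_end", which mirrors the RTT example above from the target side.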

Section: General
There is no framing of the headers and data in the buffers from TCP. If
anything goes wrong with the parsing, it is difficult if not impossible to
recover. It only takes one length field to be 'off'. If this happens, the
target will probably generate lots of "Opcode not understood" messages. We
suggest one of two methods: 1) After seeing consecutive "Opcode not
understood" messages, the receiver of those messages should shut down the
connection and, if this doesn't solve the problem, then reset the target,
or 2) When the target finds that it is out of sync with the initiator (on
receipt of an "Opcode not understood"), it will send a new iSCSI "Out of
Sync" command to the initiator. The initiator will assume on reception of
the "Out of Sync" command that all unacknowledged outstanding requests have
been dropped. The initiator then sends the next command with the OOB (out
of band) bit set, and with the OOB offset pointing to the beginning of the
iSCSI header. The target, after sending the "Out of Sync" command, should
ignore everything on that connection and wait for the OOB data to re-sync.
This exchange could also work if sent from the initiator to the target.
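
A rough sketch of the bookkeeping behind method 1: count consecutive
unparseable headers on a connection and escalate once a threshold is
reached. The class name, the action strings, and the threshold of 2 are
assumptions made for illustration; the text above only says "consecutive".

```python
class SyncMonitor:
    """Tracks consecutive 'Opcode not understood' events on one connection."""

    def __init__(self, threshold=2):
        self.threshold = threshold  # assumed value, not taken from the draft
        self.bad_in_a_row = 0

    def on_header(self, opcode_understood):
        """Return the action to take after parsing one header."""
        if opcode_understood:
            self.bad_in_a_row = 0  # any good header breaks the run
            return "process"
        self.bad_in_a_row += 1
        if self.bad_in_a_row >= self.threshold:
            return "close_connection"
        return "reject"
```

The same counter could trigger the proposed "Out of Sync" command instead of
a connection shutdown; only the returned action would change.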

Section 2.2.2: Ordering and iSCSI numbering
"The initiator and target are assumed to have three registers that define
the allocation mechanism - CmdRN, ExpCmdRN, MaxCmdRN....  The target and
initiator registers are supposed to uphold causal ordering." This indicates
that these registers guarantee ordered delivery of commands among multiple
TCP connections in a session. However, the spec continues in the following
paragraphs with a statement that seems to negate this: "iSCSI
targets are not required to use the numbering scheme for ordered delivery
even when they support multiple connections."
We believe that the draft should clarify the issue by either 1) stating
that in-order delivery MUST be guaranteed, or 2) limiting the use of the
above three registers to indicating the command queue depth of the
target, and removing any mention of their role in in-order delivery.
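
Under option 2 the registers would act purely as a flow-control window. The
sketch below assumes that ExpCmdRN is the next command number the target
expects and MaxCmdRN is the highest one it will currently accept; that
reading is our assumption, and counter wraparound is ignored for brevity.

```python
def queue_slots(exp_cmd_rn, max_cmd_rn):
    """Number of commands the target can still accept (assumed semantics)."""
    return max(0, max_cmd_rn - exp_cmd_rn + 1)


def may_send(cmd_rn, exp_cmd_rn, max_cmd_rn):
    """True if a command numbered cmd_rn fits in the advertised window."""
    return exp_cmd_rn <= cmd_rn <= max_cmd_rn
```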

Section 2.2.6:
What is the view used for? Is its intent to give different customers
different LUN views (like FC zoning)?

Section: 3.1.1
The draft suggests that the payload of the SCSI response (opcode 0x41) could
contain response data and sense data together. However, the description is
ambiguous about how to interpret any such data that follows the header: for
instance, if it contains both, which one comes first, could there be a gap
between them, and are there alignment issues? Should a target's response or
sense data be included in a SCSI data response buffer?
In any case, it seems more natural that an adapter would provide these
separately (response data for inquiry, sense for check condition, etc.).
So, in the interest of simplicity, we suggest limiting the SCSI response
messages to contain EITHER response data OR sense data, but not both. The
receiver knows which because either the res_len or the sense_len will have
a nonzero value.
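
The either/or rule makes the receiver's parsing trivial. A minimal sketch,
assuming the payload is simply the bytes that follow the header; the
function name and the returned labels are ours, while res_len and sense_len
are the fields named above.

```python
def split_response_payload(res_len, sense_len, payload):
    """Return (kind, data) for a SCSI response payload under the either/or rule."""
    if res_len and sense_len:
        # Under the proposal this combination is simply not allowed.
        raise ValueError("response data and sense data are mutually exclusive")
    length = res_len or sense_len
    if len(payload) < length:
        raise ValueError("payload shorter than declared length")
    if res_len:
        kind = "response"
    elif sense_len:
        kind = "sense"
    else:
        kind = "none"
    return kind, payload[:length]
```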

Section: 3.10
There should be a timeout interval for receiving all the data buffers
associated with a write or a read. Incomplete data may hang around
indefinitely if the TCP session stays open.
We propose that the target have a timeout for WRITE and the initiator have
a timeout for READ. The spec should specify the length of the timeout and
the recovery action. One suggestion is a timeout on the order of seconds,
with the recovery action of closing the TCP connection, starting a new
connection, and re-issuing the command. The timeout parameter could be
passed in the Login Parameters field.
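
One way to picture the proposed timeout is an inactivity timer per
outstanding transfer, restarted whenever a data buffer arrives. This is only
a sketch: the class name is hypothetical, the timeout value would come from
login negotiation, and the clock is injectable so the logic can be exercised
without waiting.

```python
import time


class TransferTimer:
    """Inactivity timer for one outstanding READ or WRITE data transfer."""

    def __init__(self, timeout_s, now=time.monotonic):
        self.timeout_s = timeout_s  # assumed to be negotiated at login
        self._now = now
        self.last_activity = now()

    def on_data(self):
        """Call when a data buffer for this transfer arrives."""
        self.last_activity = self._now()

    def expired(self):
        """True once no data has arrived for longer than the timeout."""
        return self._now() - self.last_activity > self.timeout_s
```

When expired() becomes true, the suggested recovery would be to close the
TCP connection, open a new one, and re-issue the command.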

Section: 2.2.4
The draft states in section 2.2.4 that "an initiator may request, at login,
to send immediate data of any size and a target may indicate the size of
immediate data blocks it is ready to accept in its response." But the draft
is ambiguous as to how this size negotiation is done. Is it a text command
format? If so, what is the syntax?
We propose: "ImmediateData: MaxSize", where MaxSize is the maximum size, in
bytes, of immediate data the target is willing to accept.
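
For illustration, parsing the proposed key could look like the sketch below.
It assumes one "key: value" pair per line and a decimal MaxSize; both are
our assumptions, since the draft does not define the syntax.

```python
def parse_immediate_data(line):
    """Return MaxSize in bytes, or None if the line is a different key."""
    key, sep, value = line.partition(":")
    if sep != ":" or key.strip() != "ImmediateData":
        return None
    value = value.strip()
    if not (value.isascii() and value.isdigit()):
        raise ValueError("MaxSize must be a non-negative decimal integer")
    return int(value)
```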

Section: 3.3
The SCSI Response message defines a field 'iSCSI Status'. The value 1 means
'iSCSI Check'. In some error cases the command is rejected without being
sent to the SCSI layer, because the iSCSI target layer found a problem.
Section 3.3.4 of the draft states: "if the iscsi field is not 0 the
command status will indicate CHECK CONDITION". But the Command Status field
is supposed to be the SCSI status (section 3.3.3). If the command never
went to the SCSI layer, then a specific value should not be put in that
field.

We propose in these cases to set iscsi_status to a value describing the
condition (more values than just "good" and "iscsi check", e.g.
not_logged_on, data_in_timeout, buffer_address_inconsistency), and command
status to 0. This informs the initiator that the iSCSI target layer had a
problem with this command and never passed it on to the SCSI layer,
allowing an unambiguous separation of SCSI-related error conditions from
iSCSI ones.
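
A sketch of how an initiator could use this separation. The status names are
the ones suggested above; the numeric values beyond 0 and 1 are invented for
the example and would have to be assigned by the draft.

```python
from enum import IntEnum


class IscsiStatus(IntEnum):
    GOOD = 0
    ISCSI_CHECK = 1
    # Values below are hypothetical assignments for illustration only.
    NOT_LOGGED_ON = 2
    DATA_IN_TIMEOUT = 3
    BUFFER_ADDRESS_INCONSISTENCY = 4


def classify_response(iscsi_status, command_status):
    """Say which layer reported a problem: ('ok' | 'iscsi' | 'scsi', detail)."""
    if iscsi_status != IscsiStatus.GOOD:
        try:
            name = IscsiStatus(iscsi_status).name.lower()
        except ValueError:
            name = "unknown"  # a value this initiator does not know about
        return ("iscsi", name)
    if command_status != 0:
        return ("scsi", command_status)  # genuine SCSI status from the device
    return ("ok", None)
```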

Section: 3.11.3
This section states that if the key is not recognized, it should be ignored.

How should malformed Text commands be handled?
Options include: 1) ignoring the command, 2) sending a response with no
keys, 3) closing down the session, or 4) sending an error response
(however, this may provide information to an attacker).

Section: 3.13.4
The login "parameters passed for a clear-text password authentication are:
Initiator:<domain-name>[/modifier]
Target:<domain-name>[/modifier]
Authenticator:open-sesame
Access-Id:value"
Access-Id is ambiguous. How does it differ from the Initiator? It is not
described in Appendix B.

Sections: 3.14.2, 3.11.3, and 3.12.3
A Text Command is sent because the Login Response indicated "additional
authentication required."  The login can now complete, and the target should
send a Login Response indicating "accept login." Does the target have to
send a Text Response in addition to the Login Response?  The text response
description makes it sound like all text keys must be sent back to the
initiator, if they are accepted. This would only be the case for certain
keys (such as UseRTT), correct?

Section: 3.17
The map command is very ambiguous and needs further definition.  Does the
iSCSI initiator or target issue this command?

Sections 3.6-3.8
When an iSCSI target receives a Task Management command specifying "Target
Reset", the iSCSI target may send an Asynchronous Event to any initiators
connected to the target device. 
What is the intended session shutdown sequence?  Does each initiator wait
for the target to close the TCP connection?  Does each initiator immediately
close its TCP connection?  Are the initiators expected to wait for a certain
amount of time prior to re-opening their TCP connections?  This area needs
more work.


General Proposal:
We propose that a flag be added to the SCSI Data PDU to indicate that a PDU
is the last one for the "current data transfer".  This is most useful when
the current data transfer is the data being sent in response to an RTT:  The
target knows not to expect any more data for that RTT when it sees the bit
set.  This simplifies the termination of the data transfer, particularly if
overlapping data requests are sent or the initiator fails to send all of the
requested data.  The bit might also be useful for data transfers to the
initiator in response to a SCSI command, where the data transfer can be
terminated and checked without having to watch for a SCSI Response packet.
Finally, it may belong in the SCSI Command PDU (for immediate data) and
perhaps the SCSI Response PDU (if non-sense data can show up there) for
consistency of implementation.
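
As a sketch of how the proposed flag simplifies termination on the target
side: the transfer ends as soon as a PDU with the flag set arrives, and a
shortfall can be detected at that moment. The class name and the returned
strings are hypothetical.

```python
class RttTransfer:
    """Tracks the data sent in response to one RTT."""

    def __init__(self, requested_length):
        self.requested_length = requested_length
        self.received = 0
        self.done = False

    def on_data_pdu(self, length, final):
        """Account for one SCSI Data PDU; 'final' is the proposed flag."""
        self.received += length
        if not final:
            return "more"
        self.done = True  # no further data is expected for this RTT
        if self.received == self.requested_length:
            return "complete"
        return "short"  # initiator ended the transfer without sending it all
```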