RE: Review of the 07 draft

To: "Julian Satran" <Julian_Satran@il.ibm.com>
Subject: RE: Review of the 07 draft
From: "Robert Griswold" <rgriswold@Crossroads.com>
Date: Thu, 9 Aug 2001 10:27:56 -0500
Cc: "Bill Moody" <bmoody@Crossroads.com>, <ips@ece.cmu.edu>
content-class: urn:content-classes:message
Content-Transfer-Encoding: 8bit
Content-Type: text/plain;charset="iso-8859-1"
Sender: owner-ips@ece.cmu.edu
Thread-Index: AcEd0YQEiMImcs8pRNC6GEp7VciaUwArrVcw
Thread-Topic: Review of the 07 draft
Julian:

Thanks, responses in line...

Bob

Robert Griswold
Technologist
Crossroads Systems, Inc.
512-928-7272

From: 	Julian Satran [mailto:Julian_Satran@il.ibm.com] 
Sent:	Sunday, August 05, 2001 9:52 AM
To:	Robert Griswold
Subject:	Re: Review of the 07 draft


Robert,

Thanks for you kind opening and for taking the time to read it.
Comments in text.

Julo

"Robert Griswold" <rgriswold@crossroads.com> on 03-08-2001 22:33:56

To:   Julian Satran/Haifa/IBM@IBMIL
Subject:  Review of the 07 draft



Julian:

First, great work on this initiative and please forgive my huge email.

...

++++++++++

Section 1.2.2.1 (and others):
...

+++ This point has a long history remains of which you can find in the
mail
archives.  I and several other authors where inclined to "silently
discard"
whatever can't be repaired (e.g., bugs in the initiator, delayed packets
or
packets arriving late) and explicitly reject only what can be repaired.
Debugging ease can be achieved by other means (event logs etc.). However
many of our colleagues on the list wanted error rejects to be explicit
whenever it is feasible and that is what you have now except for the
cases
in which discards are due to a "legitimate packet" arriving late (e.g.,
from a previous transmission attempt that is already recovered).
Discards are also better in preventing DOS attacks by stream replay++++

[Bob Griswold]  Thanks, that is good clarification.  Again, while I
realize there may have been a lot of history on some of these comments,
having the evolution of the thinking helps.

Section 1.2.5:
...

+++ We envisioned TTTs to be used by the target to identify parts of the
data stream that arrive in response of an R2T. Additionally they
identify
NOP-Outs that are sent in response to a polling Nop-IN and in general
when
valid indicate that the target expects something and identify the
response
(link it to the "cause") for the target. We mentioned that the protocol
does not care about the value mainly to stress that, unlike ITT, the
protocol has no uniqueness requirement for them and whenever they appear
the LUN is also present and can serve as an additional differentiator.
++++

[Bob Griswold]  This will help in understanding what TTTs will be used
for.


Section 1.2.6
...

+++ Logout has also history. We started without a logout but realized
soon
that we will need a synchronized logout to enable command allegiance
change
for recovery (that is possible only after logout). Besides it is a
simple
and painless way to do a clean shutdown of a connection. As for a target
only logout that is what the Asynch Message does (after some timeouts)
Please keep in mind that there is also a fine interplay between logout
and
recovery and that both initiator and target may decide to start some
recovery action+++

[Bob Griswold]  I see the need for the logout to be as quick and as
clean as possible.  If one assumes the Initiator and Target have agreed
on some form of possible recovery scenario, then having the target spend
cycles trying to deliver some recovery information in response to a
Logout seems reasonable; but I am not confident divergent iSCSI
platforms could come to such an agreement, short of it being defined in
one of the iSCSI specifications.


Section 1.2.7
...

+++ This is getting thrown out (it is already) considering that T10 has
gone a long way to expand the addressing of third party command source
and
destinatiantion +++

[Bob Griswold]  This is good news...



This model assumes that the iSCSI layer will deliver complete PDUs to
underlying layers in single (atomic) operations.  The underlying layer
doe
not need to examine the stream content to discover the PDU boundaries.

But please note that this whole sectionmight go away :-)

[Bob Griswold]  OK.


++++

The rest of this section uses the word MUST a few times in the
description of actions or implementation assumptions.  While the use of
the word MUST may make complete sense to the topic, and the points they
are being applied to, it seems out of place for an optional layer of the
model.  If the Synch and Steering layer is truly optional, then how can
the specification make claims to what MUST happen?

+++ it says that IF you doit that MUST be the way to do it ++++

[Bob Griswold]  My main concern is over the ability to have this
specification define implementations of internal "sync and steering" for
different host implementations.  Certainly the IETF should define the
right method for on-the-wire communication between two systems that are
implementing such layers, wherever that communication may exist, but
internal implementations for iSCSI drivers, and how they handle these
"sync and steering" delivered PDUs seems out of place.  Assuming each OS
needs its separate driver base, separate assumptions and separate
legacies, there will be very little hope for having these PDUs
delivered, internally in divergent operating systems, in the exact same
manner.  Have I made these comments clearer, or just confused it more?


Section 2.2.2
The definition of the AHS (see below, 2.2.3.1) seems to be covered as a
global PDU property, but only op-code 0x01 (SCSI Command) seems to make
use of it.  Since the AHS seems to be only needed to encapsulate an
extended CDB, then why define its possible existence for all PDUs?  Can
the AHS be used in other commands?
+++ There is another one - BiDirectional command fields. +++

[Bob Griswold]  Sorry, emphases on the wrong point here, my fault.  The
use of the AHS is only for iSCSI PDU op-code 0x01, but is defined under
section 2.2 (granted, optional) as a part of the general iSCSI PDU
format.  Since the AHS is only used for a SCSI Command, should it be
defined only under this instance, or is there some future use of the AHS
envisioned?  This comment borderlines on just editorial, and for
education for me.

Section 2.2.2.7
...

+++ I've changed it to:

1.1.1.1   LUN

   Some opcodes operate on a specific Logical Unit. The Logical Unit
Number
   (LUN) field identifies which Logical Unit.  If the opcode does not
   relate to a Logical Unit, this field either is ignored or may be used
in
   an opcode specific way.  The LUN field is 64-bits and it is to be
   formatted in accordance with [SAM2].

 ++++

[Bob Griswold]  Great, that helps a lot.

Section 2.2.3.1
The inclusion of a bit that identifies if the CDB is extended should not
live in an optional header.  Expecting that all implementations would
correctly code the use of an optional header to return critical data
that should be identified with the required portion of the CDB seems out
of place.  We would suggest that the use of a reserved bit or a bit in a
reserved byte be used, from the BHS, be used to identify an extended
CDB.  Putting the spillover of the CDB into the AHS (section 2.3.6) is
not an issue, but flagging if there is spillover should not be in the
AHS.

+++ That must be a "misreading" - We are talking about a coded field one
of
the values is Extended CDB. The whole code is the AHSType.  And if you
have
an extended CDB this header is not optional - it is the way to code the
extended CDB (no other way is decribed)
++++

[Bob Griswold]  I am clear on the fact that the Extended CDB bit is in
the AHSType field; I am assuming that you are referring to section 2.3.6
when you say the header is not optional.  Can a developer assume that if
a SCSI Command PDU is received, and the TotalAHSLength has a correctly
formatted value, that there either must be an Extended CDB AHS, or
Bi-directional AHS present?  This may be sufficient to alert from inside
of the required header that the additional header exists.


Section 2.5.1
...

+++ I can refer to some "opaque" management structure instead of MIB

[Bob Griswold]  Sounds good.

As for not  returning anything this is acceptable for Target Cold Reset
-
where the TCP connections are teared down but not for those that leave
the
connections standing +++

[Bob Griswold]  I have reread this comment, but cannot understand what
you are saying.

Section 2.8.5
...

+++ We think that hex will be the preferred encoding used but sometimes
people will copy decimal numbers. Once you have to have a decimal
conversion routine it makes little sense to force hexadecimal only by
dictate
+++

[Bob Griswold]  Well, if you assume that someone somewhere will have to
do the conversion of decimal to hex, then the burden has to live
somewhere, right?  If the wire is going to be exposed to two different
encoding schemes, then how is a lowly target going to know if its
getting decimal or hex values?  Will there be some marker, key=value
system, or something that will allow the receiver of an ambiguous data
type to know what it is getting?

Section 2.9.5
..

+++ It is not a MUST but it was thought by many (with some vehemence) as
helping matching.  IMHO it has no meaning and would suggest removing it.
+++

[Bob Griswold]  Agreed, there is no reason the same folks couldn't drive
a great "best-practices" RFC, or some SNIA document that could be used
by the industry developers to make life easier.  The out-of-order
key=value issue will never go away, but keeping code efficient is always
good.

Section 2.10.1
...

+++ The text only says that if you are going to support connection
recovery
you should be able to support opening a second connection in order to
clean
the first connection and close it. It also explicitly says that you
don't
have to support 2 connections in full feature phase. And even on 2 TCP
connections it is just a recommendation - using IP-over-pidgeons can be
an
good alternative for forcing a TCP connection to close. On a more
serious
note the second TCP connection can be on the same physical connection.
+++

[Bob Griswold]  Thanks for the explanation, and the great visual.

Section 2.12.4
If the limit of reflected data is 4096 bytes, then the instruction to
the initiator should not be "avoid sending", it should be "MUST NOT"
send more than 4096 bytes.  Is there a reason why the initiator would
want to send more than those number of bytes?

+++ The implications of saying MUST NOT would be that the target has to
check and to reject with a format error. We don't want this do we? +++

[Bob Griswold]  Since the implications of ping data escape me (sorry,
just a storage guy so pinging is something I don't understand) I
probably need to defer this to the networking thinkers.  I just hate to
have language in a specification that allows the receiver of what should
be a field overrun to do whatever it sees fit.  Can you give a two
sentence explanation on what the target would do with ping data that is
truncated, and what the initiator would expect to see in such a case?


Section 2.16
The last sentence in this section seems at odds with good error
detection for targets, and at odds with section 2.19.1.  If a SNACK
received has and error, whether or not it was a miss-directed PDU, or a
transit error, why would a target just silently ignore it?  Should it
use the SNACK Rejected code of the Reject response, or at least tell the
initiator something?  Besides, the use of the word "must" seems
inconsistent with what this sentence is trying to say.  If there is a
good reason for a target to ignore this, then it should be mentioned,
and the directive should be "MUST" if the expectations of the initiator
are set that way.

+++ A SNACK that is not supported is answered either by an Asynch
Message
asking logout - for failover - or total silence - in which case we
assume
SCSI will timeout and do its thing (abort, reissue the command or
whatever
neded.
There no need to answer. A badly formed SNACK is rejected as any othe
badly
formed PDU (although even this is perceived by some as excessive when it
does not help the initiator take repair action).
++++

[Bob Griswold]  Well, I will continue to say that a target that provides
no information on a received packet of information that is part of a
SCSI transfer is not a great target.  If some well-intentioned initiator
keeps sending what it believes to be valid PDUs, but appears to the
target to be invalid, then where and how does the initiator learn about
what is going on at the target level?

Section 2.16.1
...
+++ The target should request logout if it supports connection recovery
but
not
data snack +++

[Bob Griswold]  So is that information (supporting recovery but no data
snack) inclusive in the discovery information?


Section 4.2
...

+++ I think we can leave this to be implementation dependent. If I would
have to chose a "way out" would say limited number of text exchanges
rather
than a timeout.  ++++

[Bob Griswold]  OK, then can that be defined, or should that also be
left for implementation?


Section 7.1c
This description of the use of the Retry bit seems to put the onus of
interpretation of the Retry bit on the target for command replay.  Is it
at all possible for a Replay request to get issued, within the scope of
an initiator delaying its acknowledgement, such that the command
actually did get completed, but the Retry bit was set too soon?  Would
one assume that an initiator just avoids doing something like this?

+++ I will come back to this later. +++

[Bob Griswold]  Great, I am looking forward to it.

Section 7.6
...
+++ It is not defined here - it is assumed that when we let SCSI
commands
dry out (drop data packets, status etc.) the SCSI drivers will recover
after a timeout and do some cleanup +++

[Bob Griswold]  OK, but can ULP still be defined as an acronym for this
document?


Thanks,
Bob

Robert Griswold
Technologist
Crossroads Systems, Inc.
512-928-7272
Prev by Date: iSCSI: DataOrder and DataDeliveryOrder
Next by Date: RE: iSCSI: Support Alias in the protocol
Prev by thread: RE: Review of the 07 draft
Next by thread: RE: Review of the 07 draft
Index(es):
- Date
- Thread
Home
Last updated: Tue Sep 04 01:04:04 2001
6315 messages in chronological order