Re: iSCSI - Change Proposal X bit
Santosh,
There is nothing in a command that arrives late on a link (as in the
example in which it was sent redundantly) to distinguish it from a new
(valid) command.
This wraparound problem exists in all protocols - even in TCP, and we use
the CmdSN per session in the same fashion TCP uses sequence numbers per
connection - and it is solved in different ways (TCP uses time-stamps).
The NOP is meant to solve that wrap-around problem.
I am sure that when rereading the example you will see the issue.
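
To make the wrap-around issue concrete, here is a minimal sketch. It
assumes a 32-bit CmdSN and a target that accepts any CmdSN in the
inclusive range ExpCmdSN..MaxCmdSN. The names in_window and advance are
illustrative only and do not come from the draft.

```python
MOD = 2**32  # assumption: CmdSN is a 32-bit counter that wraps


def in_window(cmd_sn, exp_cmd_sn, max_cmd_sn):
    """True if cmd_sn lies in [exp_cmd_sn, max_cmd_sn], modulo 2**32."""
    return (cmd_sn - exp_cmd_sn) % MOD <= (max_cmd_sn - exp_cmd_sn) % MOD


def advance(exp_cmd_sn, max_cmd_sn, n):
    """Slide the window forward by n commands, wrapping at 2**32."""
    return (exp_cmd_sn + n) % MOD, (max_cmd_sn + n) % MOD


stale_cmd_sn = 4  # a copy of c4 stuck on a stalled connection

# While c4 is still outstanding it is inside the window.
valid_when_sent = in_window(stale_cmd_sn, 3, 10)

# Once c4 has been acked the window has moved on and a late copy is dropped.
valid_after_ack = in_window(stale_cmd_sn, 9, 16)

# After two full wraps the window covers the same numbers again, and the
# late copy cannot be told apart from a new command.
exp_cmd_sn, max_cmd_sn = advance(9, 16, 2 * MOD - 6)
valid_after_wrap = in_window(stale_cmd_sn, exp_cmd_sn, max_cmd_sn)
```

The last check is the case I am worried about: nothing in the late
command itself marks it as old.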
Julo
Santosh Rao <santoshr@cup.hp.com>
Sent by: santoshr@cup.hp.com
24-10-01 18:29
Please respond to Santosh Rao
To: Julian Satran/Haifa/IBM@IBMIL
cc: ips@ece.cmu.edu
Subject: Re: iSCSI - Change Proposal X bit
Julian,
Some comments on the below quoted scenarios :
> session has 3 connections
> on connection 1 I->T c1,c2,c3,C6
> on connection 2 I->T c4,c5,c7,c8
> Target receives 1,2,4,5,7,8 (miss 3 and 6) and acks 1 & 2
> Initiator closes 1 and resends c3, c4, c5,c7,c8 on connection 2 and 6
> on connection 3
> target receives all and starts executing and acks 8 on connection 3 but
> connection 2 stalls after c3 for a LONG TIME
> then (after 2 full sequence wraps) connection 2 is gets alive and
> delivers c4,c5 etc (that are now valid)
When the target acks CmdSN 8 on connection 3, it has, in effect, sent
CmdSN acks for CmdSNs 3, 4, 5, 6, 7, and 8. This implies that the
commands with CmdSN 3, 4, 5, 7, and 8 were received by the target on
connection 2 and that their processing had commenced.
Hence, the following does not make sense :
> connection 2 stalls after c3 for a LONG TIME
> then (after 2 full sequence wraps) connection 2 is gets alive and
> delivers c4,c5 etc (that are now valid)
c4, c5, etc were already delivered to the target and are not being
re-delivered. There is no problem in this case. (??).
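
The cumulative-ack reasoning above can be sketched as follows. This is
a simplified model, not text from the draft: it ignores wrap-around,
and exp_cmd_sn_after is a hypothetical helper name.

```python
def exp_cmd_sn_after(received, exp_cmd_sn):
    """Advance ExpCmdSN past every contiguously received CmdSN.

    Simplifying assumption: no wrap-around, CmdSNs are plain integers.
    """
    pending = set(received)
    while exp_cmd_sn in pending:
        pending.remove(exp_cmd_sn)
        exp_cmd_sn += 1
    return exp_cmd_sn


# Target first receives 1,2,4,5,7,8 (3 and 6 are missing): only 1 and 2
# can be acked, so ExpCmdSN stops at the first hole.
exp_with_holes = exp_cmd_sn_after([1, 2, 4, 5, 7, 8], 1)

# Once 3 and 6 arrive, ExpCmdSN moves past 8 in one step, which is what
# "acks 8" means: every CmdSN up to and including 8 has been received.
exp_after_resend = exp_cmd_sn_after([1, 2, 3, 4, 5, 6, 7, 8], 1)
```

In this model the ack for 8 cannot be produced unless c4, c5, c7 and c8
had already reached the target.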
Take the next scenario :
> 2 connections:
>
> connection1 I->T c3,c4,c5
> status of 3 contains ack up to 6 and it and all other statuses are
> lost
> connection2 resend c3, c4 & c5 (no logout) and those are executed!
Since the initiator got CmdSN acks up to 6, the initiator should not be
re-issuing these I/Os ??
I still don't see justification to require that initiators send an
immediate NOP-OUT in the manner being advocated.
On a more fundamental note, I see some issues with the initiator being
allowed to re-issue the commands on a different connection without
having first logged out the previous connection successfully. I see
nothing in the draft that suggests such behaviour, while at the same
time, it is not forbidden.
By resorting to command retries on a different connection in an attempt
to plug the hole, without first logging out the previous connection, the
initiator risks failing that I/O through a ULP timeout.
Here is a scenario showing why such recovery should not be allowed :
- Initiator sends CmdSN 3 on connection 1.
- No CmdSN updates for a while and initiator re-sends CmdSN 3 on
connection 2.
- At the same time, target has sent CmdSN acks for CmdSN 3 on
connection 1.
- Initiator has transferred the command allegiance on its side from
connection 1 to connection 2 and is attempting the command on connection
2. However, the command does not go through, since the (ExpCmdSN,
MaxCmdSN) window has advanced and the target discards the command.
- Target sends in data and/or R2T and/or status for CmdSN 3 on
connection 1. Since the initiator is not expecting any traffic for that
I/O on connection 1, it discards any PDUs received on connection 1
for which no I/O state exists.
In the above scenario, initiator will never get a CmdSN ack on
connection 2 and will never be able to plug the hole despite repeated
retries, finally, causing a ULP timeout, followed by session recovery.
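
A rough model of the above scenario, assuming a 32-bit CmdSN and an
inclusive ExpCmdSN..MaxCmdSN window. The Target and Initiator classes,
their methods and the queue_depth parameter are made up for
illustration and are not from the draft.

```python
MOD = 2**32  # assumption: 32-bit CmdSN


def in_window(cmd_sn, exp_cmd_sn, max_cmd_sn):
    """True if cmd_sn lies in [exp_cmd_sn, max_cmd_sn], modulo 2**32."""
    return (cmd_sn - exp_cmd_sn) % MOD <= (max_cmd_sn - exp_cmd_sn) % MOD


class Target:
    def __init__(self, exp_cmd_sn, queue_depth):
        self.exp_cmd_sn = exp_cmd_sn
        self.max_cmd_sn = (exp_cmd_sn + queue_depth - 1) % MOD
        self.allegiance = {}  # CmdSN -> connection the command arrived on

    def receive(self, cmd_sn, conn):
        # Anything outside the window is silently discarded.
        if not in_window(cmd_sn, self.exp_cmd_sn, self.max_cmd_sn):
            return "discarded"
        self.allegiance[cmd_sn] = conn
        if cmd_sn == self.exp_cmd_sn:  # in-order arrival slides the window
            self.exp_cmd_sn = (self.exp_cmd_sn + 1) % MOD
            self.max_cmd_sn = (self.max_cmd_sn + 1) % MOD
        return "accepted"


class Initiator:
    def __init__(self):
        self.allegiance = {}  # CmdSN -> connection the initiator expects

    def send(self, target, cmd_sn, conn):
        self.allegiance[cmd_sn] = conn  # a retry moves allegiance
        return target.receive(cmd_sn, conn)

    def on_response(self, cmd_sn, conn):
        # PDUs arriving on a connection with no I/O state are dropped.
        if self.allegiance.get(cmd_sn) != conn:
            return "dropped"
        return "processed"


target = Target(exp_cmd_sn=3, queue_depth=8)
initiator = Initiator()
first = initiator.send(target, 3, conn=1)   # original command
retry = initiator.send(target, 3, conn=2)   # premature retry
response = initiator.on_response(3, conn=target.allegiance[3])
```

Here the retry is discarded by the target while the response on
connection 1 is dropped by the initiator, so the I/O can only end in a
ULP timeout.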
Given the above scenario, I suggest that the initiator must only
re-issue commands on the same connection, and can re-issue them on
another connection only following a successful logout.
Comments ?
Thanks,
Santosh
Julian Satran wrote:
>
> Santosh,
>
> The scenarios I am talking about are all derivatives of an initiator trying
> to plug-in holes and switching connections.
> As the initiator does not know the "extent" of a hole it can send out
> commands that it did not have to.
> I sent the attached note to Mallikarjun a while ago. I think that
> there might be many of this kind. I am also aware that X bit by itself
> might have some bad scenarios but the new proposal fixes them all.
>
> Julo
>
> _____________________________
>
> Mallikarjun,
>
> Take the following sequence scenario:
>
> session has 3 connections
> on connection 1 I->T c1,c2,c3,C6
> on connection 2 I->T c4,c5,c7,c8
> Target receives 1,2,4,5,7,8 (miss 3 and 6) and acks 1 & 2
> Initiator closes 1 and resends c3, c4, c5,c7,c8 on connection 2 and 6
> on connection 3
> target receives all and starts executing and acks 8 on connection 3 but
> connection 2 stalls after c3 for a LONG TIME
> then (after 2 full sequence wraps) connection 2 is gets alive and
> delivers c4,c5 etc (that are now valid)
>
> That is not a very likely scenario, I admit, but it is possible.
> With X bit I could not find any such scenario since an X either follows a
> good one on the same connection or can be safely discarded.
> I suspect that there are some more scenarios that involve immediate
> commands or commands that carry their own ack in the status and are acked
> like:
>
> 2 connections:
>
> connection1 I->T c3,c4,c5
> status of 3 contains ack up to 6 and it and all other statuses are
> lost
> connection2 resend c3, c4 & c5 (no logout) and those are executed!
>
> I think we can avoid those by requiring a NOP exchange before reissuing a
> command on a new connection or reissue the command with a task management
> (that has an implied ordering) but why do it if X is an obvious and safe
> solution?
>
> Julo
>
> Regards,
> Julo
>
>
> "Mallikarjun C." <cbm@rose.hp.com>
> 08-10-01 21:45
> Please respond to cbm
> To: Julian Satran/Haifa/IBM@IBMIL
> cc:
> Subject: Re: iscsi : X bit in SCSI Command PDU.
>
>
>
> Julian,
>
> We currently have the following specified in section 2.2.2.1 -
>
> "The target MUST NOT transmit a MaxCmdSN that is more than
> 2**31 - 1 above the last ExpCmdSN."
>
> It appears to me that the above is sufficient to ward off the
> accidents of the sort you describe. Do you think otherwise?
> --
> Mallikarjun
>
> Mallikarjun Chadalapaka
> Networked Storage Architecture
> Network Storage Solutions Organization
> MS 5668 Hewlett-Packard, Roseville.
> cbm@rose.hp.com
>
> Julian Satran wrote:
> >
> > Mallikarjun,
> >
> > There is at least one theoretical scenario in which an "old" command
> > may appear in a "new window" and be reinstantiated.
> > At 10 Gb/s and several connections that does not take months. With X the
> > probability is far lower (not 0). I have no other strong arguments
> > but I am still thinking. Matt Wakeley, who insisted on it (against
> > me), had some other argument that I am trying to find (I do not
> > remember it).
> >
> > Julo
> >
> > "Mallikarjun C." <cbm@rose.hp.com>
> > 08-10-01 20:39
> > Please respond to cbm
> > To: Julian Satran/Haifa/IBM@IBMIL
> > cc:
> > Subject: Re: iscsi : X bit in SCSI Command PDU.
> >
> >
> >
> > Julian,
> >
> > Now that you put me on the spot, :-), my response -
> >
> > Santosh argued with me privately that X-bit no longer serves a
> > useful purpose after the advent of task management commands to
> > reassign. My response was that it never was a requirement per se,
> > but always a "courtesy" extended by the initiator to help the
> > target. I also suggested that X-bit may be considered for its
> > usefulness in debugging.
> >
> > He still had some (very reasonable) comments for simplification
> > - the most appealing of which (to me) was the opportunity to do
> > away with the X-bit checking for *every* command PDU that the target
> > has to endure now.
> >
> > If I missed a legitimate use of X-bit, please comment. Do you
> > think it is a protocol requirement per se? I couldn't justify
> > to myself so far (except the Login).
> >
> > Regards.
> > --
> > Mallikarjun
> >
> > Mallikarjun Chadalapaka
> > Networked Storage Architecture
> > Network Storage Solutions Organization
> > MS 5668 Hewlett-Packard, Roseville.
> > cbm@rose.hp.com
> >
> >
> >
> > Julian Satran wrote:
> > >
> > > Santosh,
> > >
> > > I am not sure you went through all scenarios. A conversation with your
> > > colleague - Mallikarjun - and getting through the state table may go a
> > > long way to clarify the need for X.
> > >
> > > And I am sure that by now you found yourself several.
> > >
> > > Julo
> > >
> > > Santosh Rao <santoshr@cup.hp.com>
> > > Sent by: owner-ips@ece.cmu.edu
> > > 06-10-01 01:56
> > > Please respond to Santosh Rao
> > > To: IPS Reflector <ips@ece.cmu.edu>
> > > cc:
> > > Subject: iscsi : X bit in SCSI Command PDU.
> > >
> > >
> > >
> > > All,
> > >
> > > With the elimination of command relay from iscsi [in the interests of
> > > simplification ?], I believe that the X bit in the SCSI Command PDU can
> > > also be removed. As it exists today, the X bit is only being used for
> > > command restart, which is an attempt by the initiator to plug a
> > > potential hole in the CmdSN sequence at the target. It does this on
> > > failing to get an ExpCmdSN ack for a previously sent command within some
> > > timeout period.
> > >
> > > Given the above usage of command restart, no X bit is required to be set
> > > in the SCSI Command PDU when command re-start is done.
> > >
> > > Either :
> > > (a) the target had dropped the command earlier due to a digest error, in
> > > which case, the command restart plugs the CmdSN hole in the target.
> > >
> > > [OR]
> > >
> > > (b) the target had received the command and was working on it, when the
> > > initiator timed out too soon and attempted a command restart to plug
> > > [what it thought was] a possible hole in the CmdSN sequence.
> > >
> > > In case (a), no X bit was required, since the target knows nothing of
> > > the original command. In case (b), no X bit is required again, since the
> > > (ExpCmdSN, MaxCmdSN) window would have advanced and the target can
> > > silently discard the received retry and continue working on the original
> > > command received.
> > >
> > > Removal of the X bit in the SCSI Command PDU has the following benefits :
> > >
> > > a) The CmdSN rules at the target are simplified. No need to look at X
> > > bit, only validate received CmdSN with (ExpCmdSN, MaxCmdSN) window.
> > >
> > > b) The reject reason code "command already in progress" can be removed.
> > > There's no need for this reject reason code anymore, since X bit itself
> > > is not required, and the targets can silently discard commands outside
> > > the command window and continue to work on the original instance of the
> > > command already being processed at the target.
> > >
> > > c) Less work for the target and less resources consumed since it no
> > > longer needs to generate a Reject PDU of type "command in progress". It
> > > can just silently discard any command PDU outside the (ExpCmdSN,
> > > MaxCmdSN) window.
> > >
> > > d) Less code for the target, since it does not need :
> > > - any Reject code paths when it receives X bit command PDUs that are
> > > already in progress.
> > > - No special casing of CmdSN checking rules.
> > > - No overheads of verifying a received command based on its initiator
> > > task tag, to check if the task is currently active, prior to sending a
> > > Reject response with "command in progress".
> > >
> > > Comments ?
> > >
> > > Thanks,
> > > Santosh
> > >
> > > --
> > > ##################################
> > > Santosh Rao
> > > Software Design Engineer,
> > > HP-UX iSCSI Driver Team,
> > > Hewlett Packard, Cupertino.
> > > email : santoshr@cup.hp.com
> > > Phone : 408-447-3751
> > > ##################################
>
>
> Santosh Rao <santoshr@cup.hp.com>
> Sent by: owner-ips@ece.cmu.edu
> 23-10-01 22:50
> Please respond to Santosh Rao
> To: IPS Reflector <ips@ece.cmu.edu>
> cc:
> Subject: Re: iSCSI - Change Proposal X bit
>
>
>
> Julian Satran wrote:
> >
> > However in order to drop "old" commands that might be in the pipe on a
> > sluggish connection - removing the X bit will require the initiator to
> > issue an immediate NOP requiring a NOP response on every open connection
> > whenever CmdSN wraps around (becomes equal to InitCmdSN).
>
> Julian,
>
> Can you please explain further the corner case you are describing above
> ? Are you suggesting that special action should be taken every time
> CmdSN wraps around, in case there were holes in the CmdSN sequence at
> the wrap time ? Why is that ?
>
> Here's my understanding of how this plays out :
>
> Rule 1)
> The CmdSN management rules at the target should be handling the CmdSN wrap
> case and the initiator cannot issue more than 2**31 - 1 commands beyond
> the last ExpCmdSN update it has received from the target, since the
> target MUST NOT transmit a MaxCmdSN that is more than 2**31 - 1 above
> the last ExpCmdSN. (per Section 2.2.2.1)
>
> Rule 2)
> Any holes that occur in the CmdSN sequence are attempted to be plugged
> by the initiator by re-issuing the original command. If the CmdSN never
> got acknowledged and the I/O's ULP timeout expired, the initiator MUST
> perform session recovery. (per Section 8.6)
>
> Thus, going by the above 2 rules, if the CmdSN sequence wraps up to
> ExpCmdSN, the initiator will not be able to issue further commands,
> since the target will keep the CmdSN window closed. The window can only
> re-open when the CmdSN holes are plugged allowing ExpCmdSN and thereby,
> MaxCmdSN to advance. (rule 1 above).
>
> Under the above circumstances, the initiator will possibly try to plug
> the CmdSN hole by re-issuing the original command. It may do this 1 or
> more times before its ULP timeout expires. Either the holes get plugged
> and the window re-opens, or ULP timeout occurs without the corresponding
> CmdSN for that I/O having been acknowledged, resulting in session
> logout. (rule 2 above).
>
> What is required over and beyond the above ? Why does removal of X-bit
> require an immediate NOP to be issued every time CmdSN wraps and a hole
> exists in the CmdSN sequence (??).
>
> Regards,
> Santosh
>
> --
> ##################################
> Santosh Rao
> Software Design Engineer,
> HP-UX iSCSI Driver Team,
> Hewlett Packard, Cupertino.
> email : santoshr@cup.hp.com
> Phone : 408-447-3751
> ##################################
--
##################################
Santosh Rao
Software Design Engineer,
HP-UX iSCSI Driver Team,
Hewlett Packard, Cupertino.
email : santoshr@cup.hp.com
Phone : 408-447-3751
##################################