
    RE: Comments on status responses



    
    
    Very good analysis.  I would like, however, to point out that that
    sequence of information WAS DESIGNED, albeit not by T10 but by the good
    people who designed the 360 channels (status and sense) and the 370
    channels (status, sense and extended sense).

    We might, however, end up having more than 1 bit of iSCSI status to
    indicate the troubled area.
    
    Julo
    
    Jim McGrath <Jim.McGrath@quantum.com> on 14/09/2000 21:29:14
    
    Please respond to Jim McGrath <Jim.McGrath@quantum.com>
    
    To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
    cc:
    Subject:  RE: Comments on status responses
    
    
    
    
    
    Traditionally error reporting performs two different functions.  The first
    is to direct recovery actions.  The second is to passively log information
    for future examination.
    
    SCSI was never really "designed," so it is not pure in this area.  But
    error reporting does come in three pieces of information: STATUS, SENSE
    KEY, and SENSE CODE.  The last two you traditionally get in response to a
    REQUEST SENSE command, but they can also be provided directly via an auto
    sense function.
    
    As a general rule SENSE CODE information is logged, but not used for real
    time recovery.  The combination of STATUS (i.e. CHECK CONDITION) and SENSE
    KEY is used to drive error recovery.  Some errors are already recovered
    (e.g. Recovered Error key), others cannot be easily recovered (Illegal
    Request or Hardware Error key), and others indicate a retry might work OK
    (Media Error key).
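The mapping described above can be sketched in a few lines. This is only an illustration, not anything from T10 or the iSCSI draft: the status and sense key values are the standard SCSI ones, while the `Action` names and the fallback policy for unlisted cases are assumptions.

```python
from enum import Enum

# SCSI status byte values.
GOOD = 0x00
CHECK_CONDITION = 0x02

# SCSI sense key values.
RECOVERED_ERROR = 0x1
MEDIUM_ERROR = 0x3
HARDWARE_ERROR = 0x4
ILLEGAL_REQUEST = 0x5


class Action(Enum):
    """Hypothetical recovery actions; the names are illustrative only."""
    NONE = "none"
    RETRY = "retry"
    FAIL = "fail"


def recovery_action(status, sense_key=None):
    """Pick a recovery action from STATUS and SENSE KEY only.

    The sense code is deliberately not consulted: per the text it is
    logged rather than used for real-time recovery.
    """
    if status == GOOD:
        return Action.NONE
    if status == CHECK_CONDITION:
        if sense_key == RECOVERED_ERROR:
            return Action.NONE   # the target already recovered
        if sense_key == MEDIUM_ERROR:
            return Action.RETRY  # a retry might work
        return Action.FAIL       # e.g. ILLEGAL REQUEST, HARDWARE ERROR
    return Action.FAIL           # assumption: anything else is not retried here
```

Starting from the action table and working back to the codes, as suggested, keeps the set of distinguishable codes no larger than the set of distinct recovery actions.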
    
    The lesson here is that you should start out with your expected recovery
    actions, and from that map appropriate STATUS, KEY, and CODES.  Don't do
    something because it sounds reasonable to a human - these are computers we
    are talking about, after all.
    
    The other, related issue is layering.  Error recovery may take place at
    different layers in your stack.  Some information is appropriate for the
    lowest layer - for instance, STATUS can be used by adapter hardware to
    issue immediate retries (for things like BUSY) or REQUEST SENSE.  The
    latter allows that layer to provide an Auto sense like appearance to
    higher layers, even though the target does not support Auto sense per se.
    Similarly, if the target contains layers, you may want to get separate
    information from each layer (e.g. target, LUN, sub LUNs, etc.).  The SCSI
    Controller Commands document, I think, provides some interesting examples
    of this.
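A rough sketch of the lowest-layer behaviour described above: retry immediately on BUSY, and issue REQUEST SENSE on CHECK CONDITION so that upper layers see something that looks like auto sense. The two callables and the retry limit are hypothetical stand-ins for whatever the adapter really does.

```python
GOOD, CHECK_CONDITION, BUSY = 0x00, 0x02, 0x08


def execute_with_autosense(send_command, request_sense, max_busy_retries=3):
    """Run one command at the lowest layer.

    send_command() returns a SCSI status byte and request_sense() returns
    sense data bytes; both are hypothetical callables supplied by the caller.
    Returns (status, sense), where sense is None unless the command ended
    in CHECK CONDITION.
    """
    status = send_command()
    retries = 0
    while status == BUSY and retries < max_busy_retries:
        status = send_command()   # immediate retry on BUSY
        retries += 1
    sense = None
    if status == CHECK_CONDITION:
        sense = request_sense()   # fetched on behalf of the upper layers
    return status, sense
```

The upper layer receives status and sense together and never needs to know whether the target supported auto sense itself.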
    
    In general, I would advise iSCSI to focus on the new and novel recovery
    operations required and on new layering concepts.  In those areas where
    new stuff can be identified, ask T10 to extend the STATUS/SENSE structure
    accordingly.  Personally I would try to keep the layering between SCSI and
    the underlying transport as strong as possible.
    
    Jim
    
    
    -----Original Message-----
    From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
    Sent: Thursday, September 14, 2000 8:23 AM
    To: ips@ece.cmu.edu
    Subject: RE: Comments on status responses
    
    
    
    
    Will add to my list of items to consider.
    
    Thanks,
    Julo
    
    "Merhar, Milan" <mmerhar@Pirus.com> on 14/09/2000 17:12:59
    
    Please respond to "Merhar, Milan" <mmerhar@Pirus.com>
    
    To:   Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
    cc:
    Subject:  RE: Comments on status responses
    
    
    
    
    Julo,
    
    I agree with you that status returned from a SCSI target
    device needs to be transported exactly, so that existing
    drivers, etc. (which are unaware of the intervening iSCSI
    transport) work as expected. But, I think the question
    here concerns status reported by the iSCSI layer itself.
    
    Basically, my opinion is that if we expect the other end's
    error handler to be something like
    
         if ( errno ) then abort;
    
    it makes little or no difference what we return;
    'OK' = 0 and 'ERR' = 1 works as well as an elaborate list
    of nonzero error codes. But, if there's a chance the other end
    could implement a more robust and/or less disruptive
    error recovery procedure, it makes sense to pass some
    "hints" along, through a broader set of exception codes.
    
    As an example, I suggest one might want to create a
    client-side recovery procedure that handles a storage
    device response timeout differently, depending on whether
    it was caused by a target too busy to process the command
    in a timely fashion, versus a timeout caused by excessive
    network congestion or repeated TCP error retries.
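To make the example above concrete, here is a minimal sketch of a client-side policy that treats the two timeout causes differently. The hint values, action names, and back-off numbers are all invented for illustration; the draft defines no such codes.

```python
from enum import Enum


class TimeoutCause(Enum):
    """Hypothetical hint values passed back with a timeout."""
    TARGET_BUSY = 1
    NETWORK_CONGESTION = 2
    UNKNOWN = 3


def plan_recovery(cause, attempt, base_delay=1.0):
    """Return (action, delay_seconds) for a response timeout.

    Every policy here is an assumption chosen for illustration.
    """
    if cause is TimeoutCause.TARGET_BUSY:
        # The path is fine: back off and retry on the same connection.
        return "retry_same_connection", base_delay * (2 ** attempt)
    if cause is TimeoutCause.NETWORK_CONGESTION:
        # The path is suspect: retry elsewhere without waiting on the target.
        return "retry_other_connection", 0.0
    # With no hint we are back to the 'if (errno) then abort' case.
    return "abort", 0.0
```

Without the hint, both causes fall through to the last branch, which is the point being made: richer codes only pay off if the receiver has more than one recovery path.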
    
    - milan
    
    
    -----Original Message-----
    From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
    Sent: Wednesday, September 13, 2000 1:57 PM
    To: ips@ece.cmu.edu
    Subject: RE: Comments on the Draft
    
    
    
    
    On 1 - depends on where things go with asymmetric vs. symmetric.

    2 - Explicit error codes - isn't that the sense function?

    3 - For the flag - probably yes - but I am not sure.  Status implies it, so
    you need it only if status is bad - which can be before the end.
    
    Julo
    
    "Hall, Howard" <howard@pirus.com> on 13/09/2000 20:50:10
    
    Please respond to "Hall, Howard" <howard@pirus.com>
    
    To:   Julian Satran/Haifa/IBM@IBMIL
    cc:
    Subject:  RE: Comments on the Draft
    
    
    
    
    Julo,
    
    Thank you very much for your thoughtful response.
    What do you think of some of the other issues (reference the original
    email):
    - whether CmdRN, MaxCmdRN, etc. should be used for in-order delivery of
    commands across multiple TCP connections in a session (Comments for Section
    2.2.2).
    - more explicit error codes
    - the proposed flag described at the end of the email
    
    -Howard
    
    -----Original Message-----
    From: julian_satran@il.ibm.com [mailto:julian_satran@il.ibm.com]
    Sent: Wednesday, September 13, 2000 4:52 AM
    To: Hall, Howard
    Subject: Re: Comments on the Draft
    
    
    
    
    1. General -
    
    Loss of synchronization - we hope to be able one day to get to a better
    solution, but for now I would suggest recommending that the implementer
    drop the connections (as it won't be able to parse a reset either),
    reinstate them, and then try a reset to distinguish it clearly from other
    events. If it happens in a target - drop - and wait for a reconnect, then
    send an AE to indicate the event and wait for reset.
    As for inconsistencies - I've added your proposal to the todo list - thank
    you.
    
    2. The view is a qualifier for the name - it allows a target a second
    level of refinement about what it thinks it should show the initiator. The
    first part of the name itself is the first level - but those that are
    conservative in the use of names are left with the view.
    
    3. We structured it to follow FCP closely - for bridging reasons. As far as
    I can recall, Response comes first, followed by sense. We will state it
    explicitly in the document - thanks.
    
    4. timeouts - yes, you are right, and I stated it already in an older note
    about bridging. We were thinking about three timers:
    
    - command delivery
    - data delivery
    - status delivery
    
    Those should be set at the session initiation and be known to the initiator
    and target but they don't have to be necessarily sent over the wire (as
    they imply mostly local actions in the endpoints).
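One way to picture the three timers is as a small per-session record that each endpoint holds locally. This is only a sketch: the field names mirror the list above, and the default values are placeholders, not numbers from the draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionTimers:
    """Per-session timeouts in seconds; the defaults are placeholders."""
    command_delivery: float = 5.0
    data_delivery: float = 5.0
    status_delivery: float = 5.0


def expired_phase(timers, phase, started_at, now):
    """Return True if the named phase has run longer than its timer allows."""
    limit = getattr(timers, phase)
    return (now - started_at) > limit
```

Because both ends fix the values at session initiation, expiry can be detected locally with no extra traffic on the wire.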
    
    5. immediate data length negotiation - will fix the negotiation parameters
    - thanks.
    We are also thinking about limiting it by default to 64k(?)
    
    6. iSCSI check - the reason I chose to make it a SCSI check too is to get
    a "hook" into ACA behavior. I think that is mandatory - without it all hell
    breaks loose!
    Both initiator and target could be built with ACA on check condition.
    We can remove the iSCSI status field and add a set of basic sense bytes
    detailing the error.
    Let us keep talking on this item.
    
    7. The security parts are being rewritten as we speak, and we hope that
    we will all see them soon.
    
    8. Target reset will be changed to have two types - soft and hard reset.
    
    9. closing connection. Yes, you are right, it needs more text. And we
    already agreed to add a logout (on a different connection, to force a
    connection close) and remove the RID from login.
    
    Thanks for your careful reading and hope to see you in row 1 or 2 at the
    next IETF meeting,
    Julo
    
    
    
    "Hall, Howard" <howard@pirus.com> on 13/09/2000 02:12:38
    
    Please respond to "Hall, Howard" <howard@pirus.com>
    
    To:   ips@ece.cmu.edu
    cc:    (bcc: Julian Satran/Haifa/IBM)
    Subject:  Comments on the Draft
    
    
    
    
    Here are some detailed questions, issues, and proposals we have come up
    with after reviewing the draft closely.
    
    -Howard
    
    Howard Hall
    Pirus Networks
    www.pirus.com
    
    -----------------------
    
    Section: General
    
    We need to explicitly describe in the draft how to handle loss of iSCSI
    parsing synchronization on a TCP stream.
    Resynchronization can be accomplished by shutting down the TCP connection.
    If it's the command stream that gets reset, iSCSI needs to "remember" an
    error code for this session  (for how long?), so that it can let its client
    know what happened when the command session gets reestablished.
    Similarly, suggestions should be made on how to respond when the other side
    sends messages that don't make sense, i.e.: a target sends bogus RTT
    messages, or repeated "opcode not understood" responses.
    We propose that the draft, in general, should state that initiators and
    targets should 'play safe' - especially avoiding bogus reads and writes.
    Initiators can use the 'escalating big hammer' approach, terminating the
    task, terminating the connection/session, resetting the device.
    
    Finally, what should a target do if it detects inconsistencies in the data
    buffers for a write command, e.g. offsets over the end of the total
    transfer size, offset overlaps, a data portion greater than the total
    transfer length, transfer tags which don't correspond with any in
    progress, etc.?  What should an initiator do when it decides that a target
    is behaving illegally?  For example, what if a 512 byte write is sent and
    then we receive an RTT for offset 2048?
    We propose that the draft specify aborting the write command and sending a
    response with the iSCSI status set to 1 (iSCSI check).  Better yet -
    since we have 8 bits of iSCSI status, why not have more values than just
    "good" and "iscsi check" (e.g. not_logged_on, data_in_timeout,
    buffer_address_inconsistency)?  After seeing a couple consecutively it
    should shut down the connection or, better yet, send the proposed "Out of
    Sync" command described below.
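The inconsistencies listed above are cheap to detect. The following sketch shows one possible set of checks; the function names, argument layout, and reason strings are invented for illustration and are not part of the draft.

```python
def check_data_segment(total_length, received, offset, length):
    """Validate one incoming write data segment on the target side.

    total_length: expected transfer length of the write, in bytes.
    received: list of (offset, length) segments already accepted.
    Returns None if the segment is consistent, else a short reason string.
    """
    if offset >= total_length:
        return "offset beyond end of transfer"
    if offset + length > total_length:
        return "data exceeds total transfer length"
    for seen_offset, seen_length in received:
        # Two half-open ranges overlap when each starts before the other ends.
        if offset < seen_offset + seen_length and seen_offset < offset + length:
            return "overlaps earlier segment"
    return None


def check_rtt(total_length, rtt_offset):
    """Initiator-side check: an RTT must ask for data inside the write."""
    return rtt_offset < total_length
```

For the example in the text, `check_rtt(512, 2048)` is false, so the initiator knows the target is asking for data that was never part of the 512 byte write.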
    
    Section: General
    There's no framing of the headers and data on the buffers from TCP. If
    anything goes wrong with the parsing, it's difficult if not impossible to
    recover. It only takes one length field to be 'off'. If this happens the
    target will probably generate lots of "Opcode not understood" messages.
    We suggest one of two methods: 1) after seeing consecutive "Opcode not
    understood" messages it should shut down the connection and, if this
    doesn't solve the problem, then reset the target, or 2) when the target
    finds that it is out of sync with the initiator (on receipt of an "Opcode
    not understood"), it will send a new iSCSI "Out of Sync" command to the
    initiator.  The initiator will assume at the reception of the "Out of Sync"
    command that all unacknowledged outstanding requests have been dropped.
    The initiator then sends the next command with the OOB (out of band) bit
    set, and with the OOB offset pointing to the beginning of the iSCSI header.
    The target, after sending the "Out of Sync" command, should ignore
    everything on that connection and wait for the OOB data to re-sync again.
    This exchange could also work if sent from the initiator to the target.
    
    Section 2.2.2: Ordering and iSCSI numbering
    "The initiator and target are assumed to have three registers that define
    the allocation mechanism - CmdRN, ExpCmdRN, MaxCmdRN....  The target and
    initiator registers are supposed to uphold causal ordering." This indicates
    that these registers guarantee ordered delivery of commands among multiple
    TCP connections in a session.  However, the spec continues in the following
    paragraphs with the following statement which seems to negate this: "iSCSI
    targets are not required to use the numbering scheme for ordered delivery
    even when they support multiple connections."
    We believe that the draft should clarify the issue by either: 1) stating
    that in-order delivery MUST be guaranteed, or 2)  limiting the use of the
    above three registers only to indicate the command queue depth of the
    target, and remove any mention of their role for in-order delivery.
    
    Section 2.2.6:
    What is the view used for?  Is its intent that different customers get
    different LUN views (like FC zoning)?
    
    Section: 3.1.1
    The draft suggests that the payload of the SCSI response (opcode 0x41)
    could contain response data and sense data together. However, the
    description is ambiguous in terms of explaining how to interpret any such
    data that follows the header - for instance, if it contains both, which one
    comes first, could there be a gap between them, are there alignment issues,
    etc.  Should a target's response or sense data be included in a SCSI data
    response buffer?
    In any case, it seems more natural that an adapter would provide these
    separately (response data for inquiry, sense for check condition, etc.).
    So, in the interests of simplicity, we suggest limiting the SCSI response
    messages to contain EITHER response data OR sense data, but not both. We
    know which because either the res_len or the sense_len will have a non-zero
    value.
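Under the either/or rule proposed above, interpreting the payload becomes trivial. A minimal sketch follows; `res_len` and `sense_len` are the two length fields named in the text, but how they are encoded in the header is not modelled, and the return labels are invented.

```python
def split_response_payload(payload, res_len, sense_len):
    """Return (kind, data) where kind is "response", "sense", or "none".

    Rejects a PDU that claims to carry both kinds of data, and one whose
    payload is shorter than the declared length.
    """
    if res_len and sense_len:
        raise ValueError("payload may carry response data or sense data, not both")
    length = res_len or sense_len
    if len(payload) < length:
        raise ValueError("payload shorter than declared length")
    if res_len:
        return "response", payload[:res_len]
    if sense_len:
        return "sense", payload[:sense_len]
    return "none", b""
```

With only one non-zero length there is no ordering, gap, or alignment question left to specify.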
    
    Section: 3.10
    There should be a timeout interval for getting in all the data buffers
    associated with a write and a read.  Incomplete data may hang around
    indefinitely if the TCP session stays open.
    We propose that the target needs to have a timeout for WRITE, and the
    initiator needs to have a timeout for READ.  The spec should specify the
    length of the timeout and what the recovery action is. One suggestion is a
    timeout on the order of seconds, with the recovery action of closing the
    TCP connection, starting a new connection, and re-issuing the command.  The
    timeout parameter could be passed in the Login Parameters field.
    
    Section: 2.2.4
    It states in section 2.2.4 that "an initiator may request, at login, to
    send immediate data of any size and a target may indicate the size of
    immediate data blocks it is ready to accept in its response."  But the
    draft is ambiguous as to how this size negotiation gets done.  Is it a text
    command format?  If so what is the syntax?
    We propose: "ImmediateData: MaxSize"   where MaxSize is the maximum size of
    immediate data the target is willing to accept in bytes.
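As a sketch of how the proposed key might be handled, assuming the value is a plain decimal byte count (the proposal above does not pin down the encoding) and that the effective limit is simply the smaller of the two sides' values:

```python
def parse_immediate_data(line):
    """Parse a hypothetical 'ImmediateData: <bytes>' text key.

    Returns the size as an int, or None if the line is some other key.
    Raises ValueError for a malformed or negative size.
    """
    key, sep, value = line.partition(":")
    if not sep or key.strip() != "ImmediateData":
        return None
    size = int(value.strip())   # int() raises ValueError on junk
    if size < 0:
        raise ValueError("size must be non-negative")
    return size


def negotiate(initiator_wants, target_max):
    """The initiator may send at most what the target will accept."""
    return min(initiator_wants, target_max)
```

For instance, an initiator asking for 131072 bytes against a target limit of 65536 would end up sending at most 65536 bytes of immediate data.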
    
    Section: 3.3
    The SCSI Response message defines a field 'iSCSI Status'. The value 1 means
    'iSCSI Check'. In some error cases the command is rejected without being
    sent to the SCSI layer, because the iSCSI target layer found a problem.
    Section 3.3.4 of the draft states: "if the iscsi field is not 0 the command
    status will indicate CHECK CONDITION".  But the Command Status field is
    supposed to be the SCSI status (section 3.3.3). If the command never went
    to the SCSI layer, then a specific value should not be put in that field.
    
    We propose in these cases to change iscsi_status to the condition (more
    values than just "good" and "iscsi check" i.e. not_logged_on,
    data_in_timeout, buffer_address_inconsistency), and command status to 0.
    This informs the initiator the iscsi target layer had a problem with this
    command and never passed it on to the SCSI layer, allowing an unambiguous
    separation of SCSI-related error conditions from iSCSI ones.
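The separation proposed above might look like this on the initiator side. Values 0 and 1 follow the draft as quoted in this note; the numeric values 2 to 4 are invented here purely to give the three suggested names something to map to.

```python
from enum import IntEnum


class IscsiStatus(IntEnum):
    """0 and 1 are from the draft as quoted; 2-4 are assumed values."""
    GOOD = 0
    ISCSI_CHECK = 1
    NOT_LOGGED_ON = 2
    DATA_IN_TIMEOUT = 3
    BUFFER_ADDRESS_INCONSISTENCY = 4


def classify_response(iscsi_status, command_status):
    """Tell the initiator which layer reported the problem, if any."""
    if iscsi_status != IscsiStatus.GOOD:
        # Rejected by the iSCSI layer: the command never reached SCSI,
        # so command_status carries no meaning and is ignored.
        return "iscsi", IscsiStatus(iscsi_status).name
    if command_status != 0:
        return "scsi", hex(command_status)
    return "ok", None
```

An unmodified SCSI driver only ever sees the second branch, while an iSCSI-aware layer can act on the first without guessing whether a CHECK CONDITION really came from the device.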
    
    Section: 3.11.3
    This section states that if the key is not recognized, it should be
    ignored.

    How should malformed Text commands be handled?
    They could be handled by: 1) ignoring the command, 2) sending a response
    with no keys, 3) closing down the session, or 4) sending an error response
    (however, this may provide information to an attacker).
    
    Section: 3.13.4
    the login "parameters passed for a clear-text password authentication are:
    Initiator:<domain-name>[/modifier]
    Target:<domain-name>[/modifier]
    Authenticator:open-sesame
    Access-Id:value"
    Access-Id is ambiguous.  How does it differ from the Initiator?  It is not
    described in Appendix B.
    
    Sections: 3.14.2, 3.11.3, and 3.12.3
    A Text Command is sent because the Login Response indicated "additional
    authentication required."  The login can now complete, and the target
    should send a Login Response indicating "accept login." Does the target
    have to send a Text Response in addition to the Login Response?  The text
    response description makes it sound like all text keys must be sent back to
    the initiator, if they are accepted. This would only be the case for
    certain keys (such as UseRTT), correct?
    
    Section: 3.17
    The map command is very ambiguous and needs further definition.  Does the
    iSCSI initiator or target issue this command?
    
    Sections 3.6-3.8
    When an iSCSI target receives a Task Management command specifying "Target
    Reset", the iSCSI target may send an Asynchronous Event to any initiators
    connected to the target device.
    What is the intended session shutdown sequence?  Does each initiator wait
    for the target to close the TCP connection?  Does each initiator
    immediately close its TCP connection?  Are the initiators expected to wait
    for a certain amount of time prior to re-opening their TCP connections?
    This area needs more work.
    
    
    General Proposal:
    We propose that a flag be added to the SCSI Data PDU to indicate that a PDU
    is the last one for the "current data transfer".  This is most useful when
    the current data transfer is the data being sent in response to an RTT:
    the target knows not to expect any more data for that RTT when it sees the
    bit set.  This simplifies the termination of the data transfer,
    particularly if overlapping data requests are sent or the initiator fails
    to send all of the requested data.  The bit might also be useful for data
    transfers to the initiator in response to a SCSI command, where the data
    transfer can be terminated and checked without having to watch for a SCSI
    Response packet.  Finally, it may belong in the SCSI Command PDU (for
    immediate data) and perhaps the SCSI Response PDU (if non-sense data can
    show up there) for consistency of implementation.
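A small sketch of how a receiver could use such a flag to close out the data for one RTT. The class and field names are illustrative, and `final` stands for the proposed bit; nothing here is taken from the draft's PDU layout.

```python
class RttTransfer:
    """Tracks data received for one RTT; names are illustrative only."""

    def __init__(self, requested_length):
        self.requested_length = requested_length
        self.received = 0
        self.done = False

    def on_data_pdu(self, length, final):
        """Account for one data PDU; `final` is the proposed last-PDU flag."""
        if self.done:
            raise RuntimeError("data PDU after the final one")
        self.received += length
        if final:
            self.done = True

    def is_short(self):
        """True once the transfer ended with less data than requested."""
        return self.done and self.received < self.requested_length
```

With the flag, a short transfer is detected as soon as the last PDU arrives, rather than by waiting for a timeout or for some later PDU to imply that the sender has moved on.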
    
    
    
    
    
    
    
    
    
    
    
    
    

