
    Re: iSCSI ERT: data SACK/replay buffer/"semi-transport"



    At 09:12 AM 4/3/2001 -0400, Stephen Bailey wrote:
    > > The Stone and Partridge paper is mostly not applicable to an iSCSI
    > > environment.  The principal failure mechanisms were major software
    > > bugs in the driver stack of PC-oriented machines.
    
    People make mistakes in all implementations.  Examination of other similar 
    packet processing technology for mistakes is applicable to any effort and 
    one should perform a risk assessment as to the probability of the mistakes 
    being repeated here.  The fact that the mistakes were in PC-oriented 
    machines is basically irrelevant; storage is not immune to similar 
    mistakes (I have seen storage implementations that were just as poor in 
    quality as those in any other segment of the industry).
    
    
    >I'm in complete agreement with Bob.
    >
    >I haven't seen a good analysis of TCP checksum escapes which resulted
    >from intermediary manipulation (I haven't read the papers, but
    >hopefully soon), but my hunch is that it's incredibly rare.
    >
    >An endpoint-precipitated TCP checksum `escape' will also escape a CRC or
    >any other similar integrity check.  That is why I think all this
    >additional integrity checking (on iSCSI headers & data), is an
    >incredible amount of extra work (not just in computing the CRCs, but
    >also in designing the SACK mechanism and recovery for digest failures)
    >for no real gain.
    
    I agree that some of the recovery is overkill but disagree that error 
    detection is as well.  At a minimum, one needs to have a strong end-to-end 
    error detection mechanism.  Many believe a 16-bit checksum is not adequate 
    to protect their data and given the importance of this data to our 
    customers, most feel the specification must define such a mechanism (with 
    some having strong feelings that this mechanism should NOT be 
    optional).  Now whether we need to have 2 CRCs, etc. is a separate debate 
    but they need to be there and most of us will require that they be used in 
    any product / solution delivered to the customer.
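
    A minimal sketch of one reason a 16-bit checksum is considered weak: 
    the TCP checksum is a ones'-complement sum of 16-bit words, so it cannot 
    see words that arrive in the wrong order, while a CRC can.  The function 
    and variable names are illustrative, and zlib's CRC-32 is only a stand-in 
    for whichever CRC the specification ends up defining.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum in the style used by TCP."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Same three 16-bit words, with the first two swapped in the second copy.
original = b"\x12\x34\x56\x78\x9a\xbc"
reordered = b"\x56\x78\x12\x34\x9a\xbc"

# Addition is commutative, so the checksum cannot tell these apart.
same_checksum = internet_checksum(original) == internet_checksum(reordered)

# CRC-32 depends on bit positions, so the reordering changes its value.
same_crc = zlib.crc32(original) == zlib.crc32(reordered)

print("checksum collides:", same_checksum)
print("CRC-32 collides:  ", same_crc)
```

    Word reordering is only one class of error; the sketch says nothing 
    about how often such errors occur in practice.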
    
    >The real loss is that it's immensely slowing time-to-market for iSCSI 
    >(both in the front end specification and the back end implementation).
    
    A fast TTM solution that is not of the highest quality (i.e. one that 
    does not prevent silent data corruption) will lead to customer distrust 
    and a repeat of the FC adoption rate: only now, 10 years later, has it 
    really started to penetrate customer solutions.
    
    
    >A straw-man proposal (very unpopular given where we are, I know) would
    >be to specify iSCSI without additional integrity checks (other than
    >what you can get through security mechanisms, which is probably not
    >visible to iSCSI anyway), and if that `fails' (I'm sure it won't), we
    >can put an integrity shim between iSCSI and the transport.
    >
    >One example of how to do this would be Julian's TAF.  Another would be
    >the WARP RDMA layer.
    
    If another layer is put in place that provides data integrity, then it is 
    redundant to do this at the iSCSI layer as well and this is one place where 
    an option can be used, i.e. one negotiates the underlying framing mechanism 
    (e.g. WARP) and if it is present, then iSCSI does not activate the CRC 
    services.  If it is not present, then it does, thereby ensuring that there 
    is always end-to-end data integrity present in the solution.
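
    The negotiation rule described here can be sketched as follows.  This is 
    a hypothetical illustration only: the Transport type, its field names 
    and the iscsi_crc_enabled function are assumptions made for the example 
    and do not come from any iSCSI draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transport:
    """Hypothetical description of the negotiated layer beneath iSCSI."""
    name: str
    provides_strong_integrity: bool  # e.g. a framing layer with its own CRC


def iscsi_crc_enabled(transport: Transport) -> bool:
    """iSCSI runs its own CRC only when the layer below does not provide one.

    Either way, exactly one layer supplies strong end-to-end integrity.
    """
    return not transport.provides_strong_integrity


plain_tcp = Transport(name="plain TCP", provides_strong_integrity=False)
framed = Transport(name="framing shim with CRC", provides_strong_integrity=True)

for t in (plain_tcp, framed):
    print(f"{t.name}: iSCSI CRC enabled = {iscsi_crc_enabled(t)}")
```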
    
    
    >We don't have to specify how to do this now
    
    If this is to be supported then it should be specified now (this can be 
    done rather opaquely by just setting a "transport services" attribute for 
    strong end-to-end data integrity protection).
    
    >, and the point is that
    >it's hard to do so, because we really don't know what problem we're
    >solving with it.  We're OK as long as we have a way to address it in
    >the future without completely chucking what already exists.
    >
    >The other point to remember is that iSCSI still has to make the
    >ID->Proposed->Draft->Internet traversal, and anybody that thinks it's
    >going to do that on the first try is kidding themselves.  It's more
    >important to get SOMETHING out there that exposes the implementation
    >holes than to design a cathedral on paper.
    
    Nothing is perfect the first time out but in the tightening economy and 
    increasing customer quality demands from the get-go, the trade-off between 
    quality / reliability and TTM is not something people should rush to 
    make.  The market is not what it used to be where good enough was alright; 
    customers expect more today and with good cause.
    
    Mike
    
    


