|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Some Thoughts on Digestsdraft-ietf-ips-iSCSI-02b.txt states: > CRC32 is effective only for limited data lengths (the probability of > an error going undetected grows linearly with data length). When > using CRC32-2K the digest size increases with data length. I believe this is a bit of an oversimplification. I will elaborate a bit. The absolute probability of an undetected error will of course increase at least linearly with message length because the probability of an error occurring in the first place will increase linearly. This is true also in the case where the message is chopped into 2K segments and a CRC is calculated for each segment. If the probability that a given segment has an undetected error is P, then the probability that one of N segments has an error is approximately N*P. What is the conditional probability that if there is an error it will be undetected? This is more complicated. If the total extent of the error is less than or equal to 32 bits, then the probability it will be undetected by the CRC is zero. Therefore if one assumes that individual error events will be of extent less than or equal to 32 bits, then the only way an undetected error can occur is for at least two independent errors to occur in the same segment. The probability of this occurring will increase approximately linearly with the segment size. I would say this with two caveats, however. First the assumption that individual error events will have extent less than or equal to 32 bits is questionable. And second, no matter how long the segment, the conditional probability that if the segment contains an error, it will be undetected by the CRC will NEVER be more that 2^-32. I would argue that based on this the added complexity of segmenting the message into 2K blocks for CRC computation is not justified and the CRC should NOT be considered ineffective for large blocks of data. (Unless you are prepared to argue that a digest that misses one in 2^32 errors is ineffective.) -------------- CRC Polynomial I would argue that of all the prime polynomials of order 32, the one selected for iSCSI is the WORST one. If a block of data is protected by a CRC-32 and the result is embedded inside another block which is also protected by a CRC-32, then the combined protection is effectively 64 bits. The maximum probability of an undetected error is 2^-64. UNLESS both CRCs use the same polynomial, in which case the combined protection is no better than the protection of only the outer CRC. The proposed CRC of iSCSI uses the same polynomial as the Ethernet CRC. Since iSCSI data will typically be contained inside Ethernet frames, the iSCSI CRC should use a different polynomial than Ethernet. The CRC section of http://www.ietf.org/internet-drafts/draft-dicecco-vitcp-01.txt contains an example of a better CRC. Arguments against using a CRC algorithm other than CCITT include the following: 1. The CCITT CRC-32 has been studied extensively and using a different algorithm would incur unnecessary risk. 2. Using a standard CRC algorithm allows use of existing hardware and software implementations. 3. Referencing an existing CRC algorithm saves work in adequately documenting the algorithm. With respect to #1, I would hope this could be referred to some acknowledged expert in the CRC field. I am sure you will find no studies on the effectiveness of the CRC which depend on any properties of the polynomial not shared by other polynomials such as the one called out in the above example. With respect to #2, it is unlikely that any existing hardware can be used and likely that any iSCSI implementation will require building new ASICs. Designing the polynomial specific section of a high speed CRC unit should take no more than a few days. Having done this, I can speak from experience. For software implementations, the arguments are similar, but the work is a lot less. With respect to #3, since the entire algorithm stays the same except one constant (the polynomial) this should not be too bad. New test vectors would of course need to be generated. --------------- With respect to the HMAC functions, do we need both SHA and MD5? I would expect hardware vendors may choose one or the other to implement in hardware. It would be nice if different suppliers chose the same one. Arguably this is outside the scope of the standard, but recommendations as to the preferred digest algorithms to implement in hardware might result in better interoperability of the resulting products that emerge from the standard.
Home Last updated: Tue Sep 04 01:06:11 2001 6315 messages in chronological order |