iSCSI : Digest Error recovery causes data corruption

Julian & All,

Section 5.5 on digest errors states that an initiator MUST "discard and re-start" a task when it encounters a header or data digest error, provided it can recognize the initiator task tag.

I assume the above reference to re-start is to the use of the "retry" bit. (?)

If so, there is a possibility of this error recovery mechanism leading to data corruption. The probability is reduced with the removal of partial data recovery (based on DataRN/SN). However, data corruption can still occur as follows:

- The last Data PDU of a READ I/O is returned from target to initiator.

- The initiator detects a header or data digest error on this last Data PDU and discards the PDU.

- The initiator re-starts the task (using the "retry" bit).

- The target sends the Status PDU for the previous instance of the command. (It is not clear from the spec what the initiator does with stale frames that continue to arrive on the previous instance of the I/O. For now, I assume the initiator will, by some mechanism, discard such frames.)

- When the target receives the "retry" of this command, it thinks it has already sent back all the data, so it sends back only Status for this "retry".

- The initiator has no count-based checks, so it depends on (trusts!) the target's status, based on which it reports a successful I/O completion to the initiator's SCSI ULP [indicating no residual count, since the target thought it sent all the data].

- The SCSI ULP assumes a completed I/O and notifies the application [since it depends on the initiator notifying it with an appropriate service response on an underflow, which the initiator in this case did not detect].

- The application encounters data corruption, due to the missing Data PDU, which was discarded by the initiator on a digest error and was never re-sent by the target, since the target does partial recovery by sending only the status.
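The sequence above can be sketched as a small simulation. This is only an illustrative model under stated assumptions, not iSCSI code: the `Target` class, the `read_with_retry` function, the PDU tuples, and the `count_check` flag are all hypothetical names. It assumes the target remembers how much data it sent for a task tag, that the initiator keeps the Data PDUs that passed the digest check, and that the stale Status PDU from the first attempt is dropped.

```python
from dataclasses import dataclass, field


@dataclass
class Target:
    """Hypothetical target that remembers per-task progress across a retry."""
    data_pdus: list
    sent: dict = field(default_factory=dict)  # task tag -> Data PDUs already sent

    def handle(self, tag, retry=False):
        # On a retry the target resumes from what it believes it already sent.
        already = self.sent.get(tag, 0) if retry else 0
        pdus = [("DATA", p) for p in self.data_pdus[already:]]
        self.sent[tag] = len(self.data_pdus)
        # The target believes all data was delivered, so residual is 0.
        pdus.append(("STATUS", {"status": "GOOD", "residual": 0}))
        return pdus


def read_with_retry(target, expected_bytes, count_check):
    """Model a READ whose last Data PDU fails the digest check, then is retried."""
    tag = 0x10
    buffer = bytearray()

    first = target.handle(tag)
    data = [p for kind, p in first if kind == "DATA"]
    for payload in data[:-1]:   # last Data PDU is discarded (digest error)
        buffer += payload
    # The Status PDU of the first attempt is treated as stale and dropped.

    status = None
    for kind, payload in target.handle(tag, retry=True):
        if kind == "DATA":
            buffer += payload
        else:
            status = payload

    # Optional count-based check: compare bytes received with bytes expected.
    if count_check and len(buffer) != expected_bytes - status["residual"]:
        return "UNDERFLOW_DETECTED", bytes(buffer)
    return status["status"], bytes(buffer)


no_check = read_with_retry(Target([b"AAAA", b"BBBB", b"CCCC"]), 12, count_check=False)
with_check = read_with_retry(Target([b"AAAA", b"BBBB", b"CCCC"]), 12, count_check=True)
print(no_check)
print(with_check)
```

Without the count check, the model reports GOOD status while the buffer is missing the last 4 bytes. With the check enabled, the initiator notices the shortfall on its own instead of trusting the target's status.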
StatSN-based partial status recovery can lead to dangerous corner cases like this one, with possible data corruption.

Regards,
Santosh

Santosh Rao
Software Design Engineer
Hewlett Packard, Cupertino (SISL)
santoshr@cup.hp.com
Last updated: Tue Sep 04 01:05:38 2001