iSCSI Data Integrity - Digests
The following describes three alternatives for iSCSI data
integrity.
Mike Krause
HP
Requirements:
- Use existing / proven CRC algorithms and techniques to provide fast
market enablement while avoiding a "reinvention of the wheel"
exercise.
- Provide strong end-to-end data integrity for both iSCSI PDU header
and data payload
- CRC is required for all implementations, i.e. strong end-to-end
data integrity is not optional, as customers will not adopt solutions
without such guarantees.
Alternative 1 (Highest preference):
- An iSCSI PDU shall be restricted to a single TCP segment.
Multiple iSCSI PDUs may be present within the same TCP segment but none
shall span multiple segments.
- Each iSCSI PDU is protected by a trailing 32-bit CRC (Ethernet
polynomial), i.e. a single CRC covers the entire iSCSI header and data.
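As a rough illustration of the trailing-CRC layout, here is a minimal
Python sketch. The 8-byte header (opcode, reserved byte, data length,
task tag) and the network byte order of the trailer are assumptions
made only for this example and are not taken from any iSCSI draft.
zlib.crc32 is used because it implements the 32-bit Ethernet polynomial.

```python
import struct
import zlib

# Hypothetical 8-byte header: opcode (1), reserved (1), data length (2), task tag (4).
HEADER = struct.Struct("!BxHI")
CRC = struct.Struct("!I")  # trailing 32-bit CRC, network byte order (assumed)


def build_pdu(opcode, tag, data):
    """Return header + data + one CRC covering both."""
    body = HEADER.pack(opcode, len(data), tag) + data
    return body + CRC.pack(zlib.crc32(body))


def verify_pdu(pdu):
    """Recompute the CRC over header + data and compare with the trailer."""
    body, trailer = pdu[:-CRC.size], pdu[-CRC.size:]
    return zlib.crc32(body) == CRC.unpack(trailer)[0]
```

A receiver would call verify_pdu() on each PDU it finds in a segment and
reject any PDU for which it returns False.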
Assumption:
- The iSCSI PDU header is not modified during transmission. While
there has been some discussion of a desire to provide such a capability
in the future, there is no current requirement that the specification
take this into account. Should this become
a requirement in the future, an intermediate endnode supporting iSCSI
header modification would need to guarantee strong data integrity within
its implementation using any of the well-known / deployed techniques.
Benefits:
- Strong end-to-end data integrity using a well-known, proven
technology.
- Low-cost, high-speed hardware implementations with readily available
hardware cores can be created with minimal design complexity.
- Only one CRC is needed, and it can be implemented in software, mitigating
the performance impact that iSCSI data integrity would otherwise impose.
- Ability to accelerate software iSCSI implementations using a slightly
modified NIC to perform the CRC calculation / verification for both
inbound and outbound data streams. This modified NIC would only
require minor understanding of the iSCSI header, i.e. to identify it and
locate the CRC within the data stream. The CRC can be verified while
coming in off the "wire" or inserted while being placed on the
"wire". This technique is well understood since it
is very similar to what is implemented by TCP checksum off-load
implementations in use today.
- Note: A NIC implementing this functionality could combine the
verification of the TCP checksum into a "one-stop" verification
operation and silently drop invalid packets or tag them as
"bad" for ULP processing.
- Solves the framing problem while eliminating the need for future
support of "chunking" / RDMA technology. Each PDU header
contains sufficient information required for direct data placement
providing the same benefits attributed to chunking / RDMA. This
will also allow simplified "bridge" solutions to be
constructed, e.g. iSCSI-to-InfiniBand, iSCSI-to-SRP, etc.
- Eliminates the need to maintain intermediate CRC results (both
inbound and outbound) reducing implementation cost / complexity.
- Eliminates bandwidth waste by reducing the number of bytes required
to guarantee end-to-end data integrity while supporting multiple small
PDUs per segment (compaction).
- Provides improved QoS arbitration control / management - if a PDU
were allowed to span multiple segments, then an implementation would need
to transmit segments back-to-back (or very close) to deliver strong
end-to-end performance / transaction throughput. This may be
implementation-specific but is still a tangible benefit for customers.
- If an intermediate endnode performs re-segmentation, a PDU may
span multiple segments. This would be detected by a PDU CRC error
providing a simple detection mechanism allowing implementations to
recover either at the connection or session level.
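The receive-side check described in the benefits above (walk the PDUs in
one segment, verify each CRC, and notice a PDU that was re-segmented)
might look roughly like the following. The header layout is the same
hypothetical one as before (opcode, reserved byte, data length, task
tag); the function names are illustrative only. Note that this sketch
also uses the length field to catch a PDU that runs past the end of the
segment, in addition to the CRC check.

```python
import struct
import zlib

HEADER = struct.Struct("!BxHI")  # hypothetical: opcode, pad, data length, task tag
CRC_LEN = 4


def make_pdu(opcode, tag, data):
    body = HEADER.pack(opcode, len(data), tag) + data
    return body + struct.pack("!I", zlib.crc32(body))


def split_segment(segment):
    """Return (verified_pdus, ok); ok is False on a truncated PDU or CRC error."""
    pdus, offset = [], 0
    while offset < len(segment):
        if len(segment) - offset < HEADER.size + CRC_LEN:
            return pdus, False  # not even a full header + trailer left
        _, length, _ = HEADER.unpack_from(segment, offset)
        end = offset + HEADER.size + length + CRC_LEN
        if end > len(segment):
            return pdus, False  # PDU would spill into the next segment
        body = segment[offset:end - CRC_LEN]
        (crc,) = struct.unpack_from("!I", segment, end - CRC_LEN)
        if zlib.crc32(body) != crc:
            return pdus, False  # corrupted, or parsing started mid-PDU
        pdus.append(body)
        offset = end
    return pdus, True
```

When ok comes back False, the implementation would fall back to
connection-level or session-level recovery as described above.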
Constraints:
- iSCSI implementations must be able to determine each connection's MSS
and create iSCSI PDUs that fit within the MSS. Such functionality is
available in a variety of TCP implementations today and in hardware
implementations.
- For the send-side retransmission problem (i.e. how to delineate
packets within a byte stream), a hardware implementation is
straightforward to support since it provides the PDU-segment
correlation.
- For a software implementation, the mbuf / mblk encompassing the iSCSI
PDU would be marked to indicate whether the associated buffer should be
sent within a separate segment or not. This is not common in
TCP implementations to date but is not difficult to implement. It
should also be noted that this is an implementation issue, not a TCP
protocol issue.
- If a layer 4 intermediate endnode glues together two TCP streams and
is not iSCSI aware, the send-side retransmission is a problem.
However, it is unclear whether this usage model must be transparently
supported by iSCSI, i.e. such an intermediate endnode should be required
to be iSCSI aware. This is not unreasonable as most layer 4
intermediate endnodes are providing some value-add service as a function
of layer 4; why wouldn't such an endnode provide iSCSI value-add and thus
be layer 5 aware?
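The send-side rule (several PDUs may share a segment, none may span two)
can be sketched as a simple greedy packer. This is only an illustration:
pack_segments and its arguments are made-up names, and a real stack would
get the MSS from the connection rather than as a parameter.

```python
def pack_segments(pdus, mss):
    """Group whole PDUs into segment payloads of at most mss bytes each."""
    segments, current = [], b""
    for pdu in pdus:
        if len(pdu) > mss:
            # The sender must size PDUs to the MSS before this point.
            raise ValueError("PDU larger than MSS")
        if current and len(current) + len(pdu) > mss:
            segments.append(current)  # close the segment at a PDU boundary
            current = b""
        current += pdu
    if current:
        segments.append(current)
    return segments
```

Every segment boundary produced this way is also a PDU boundary, which
is the PDU-segment correlation that retransmission has to preserve.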
Alternative 2 (middle preference):
- An iSCSI PDU shall be restricted to a single TCP segment.
Multiple iSCSI PDUs may be present within the same TCP segment but none
shall span multiple segments.
- Each iSCSI PDU is protected by two CRCs - one invariant and one
variant. The invariant CRC (ICRC) is a 32-bit CRC covering the PDU
data and invariant header fields (e.g. address). The variant CRC
(VCRC) is either a 16 or 32-bit CRC that covers the entire PDU header,
data, and invariant CRC. PDU layout would be: header, data, ICRC,
VCRC.
- Note: This scheme is conceptually the same as what is used in
InfiniBand providing customers and the industry with a single paradigm
and improved technology integration for both compute and storage
endnodes.
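To make the ICRC / VCRC split concrete, here is a small sketch. The
header layout (opcode, one variant "hop" byte, data length, target
address), the choice of a 32-bit VCRC, and the trick of masking variant
bytes with 0xFF before computing the ICRC are all assumptions for
illustration only.

```python
import struct
import zlib

# Hypothetical header: opcode, hop field (variant), data length, target address.
HEADER = struct.Struct("!BBHI")
VARIANT_BYTES = (1,)  # header byte offsets an intermediate node may rewrite


def icrc(header, data):
    """CRC over data and header with the variant bytes masked out."""
    masked = bytearray(header)
    for i in VARIANT_BYTES:
        masked[i] = 0xFF
    return zlib.crc32(bytes(masked) + data)


def build_pdu(opcode, hop, address, data):
    """Layout: header, data, ICRC, VCRC."""
    header = HEADER.pack(opcode, hop, len(data), address)
    body = header + data + struct.pack("!I", icrc(header, data))
    return body + struct.pack("!I", zlib.crc32(body))


def rewrite_hop(pdu, hop):
    """Intermediate node: change a variant field, recompute only the VCRC."""
    body = bytearray(pdu[:-4])
    body[1] = hop
    return bytes(body) + struct.pack("!I", zlib.crc32(bytes(body)))


def check(pdu):
    """Return (vcrc_ok, icrc_ok)."""
    body = pdu[:-4]
    (vcrc,) = struct.unpack("!I", pdu[-4:])
    header, data = body[:HEADER.size], body[HEADER.size:-4]
    (stored_icrc,) = struct.unpack("!I", body[-4:])
    return zlib.crc32(body) == vcrc, icrc(header, data) == stored_icrc
```

After rewrite_hop() both checks still pass and the ICRC bytes are
untouched, whereas a change to the target address fails the ICRC even if
the intermediate node recomputes the VCRC.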
Benefits relative to Alternative 1:
- Supports an intermediate endnode updating iSCSI header fields while
supporting strong end-to-end data integrity of all invariant header
fields and data. It is critical that all invariant header fields
such as target address be protected at all times to avoid silent data
corruption / illegal memory access since these fields are used to DMA the
data into / from target memory.
- Note: This problem does not exist in IP-based applications today
since such implementations do not expose addresses across the wire but
use look-up techniques as a function of the header. iSCSI
implementations may choose to use a similar technique but at the cost of
increased resources / complexity.
- Limits the complexity / overhead required to support a separate
header CRC - e.g. intermediate byte-stream CRC injection /
verification. This simplifies the hardware implementation for a full
off-load solution and also makes it possible to build the simplified
CRC acceleration described in Alternative 1 for software-based iSCSI
implementations.
- Use of two trailer CRCs does not impact overall end-to-end
performance or endnode hardware resources. Implementations are
gated more by the memory subsystems / cache coherency overheads than by
external wire speed transmission, i.e. the packet will, in general,
arrive before one could complete the first few cache line fetch
operations. As such, given the single-segment operation, the
data can be verified as it comes in off the wire and the memory
operations initiated with minimal latency (most operations will be
pipeline operations within a few cycles).
- An intermediate endnode can provide data integrity checks while data
is in-flight and stomp the CRC should it detect an error. This
allows packet flow-through to be supported while providing fault
isolation and a signal for subsequent endnodes to drop invalid packets if
they desire.
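One possible way to "stomp" a CRC is sketched below. It assumes the
packet ends in a 32-bit trailing CRC and that a stomped packet carries
the bitwise inverse of the CRC computed over the bytes actually
forwarded; that convention is an assumption chosen for this example,
not something defined above.

```python
import struct
import zlib


def crc_ok(packet):
    (crc,) = struct.unpack("!I", packet[-4:])
    return zlib.crc32(packet[:-4]) == crc


def forward(packet):
    """Flow-through forwarding: pass good packets, stomp bad ones."""
    if crc_ok(packet):
        return packet
    # Inverted CRC: can never match, and marks the error as already seen upstream.
    stomped = zlib.crc32(packet[:-4]) ^ 0xFFFFFFFF
    return packet[:-4] + struct.pack("!I", stomped)


def is_stomped(packet):
    (crc,) = struct.unpack("!I", packet[-4:])
    return crc == zlib.crc32(packet[:-4]) ^ 0xFFFFFFFF
```

A downstream endnode that sees is_stomped() return True knows the fault
occurred before the stomping node, which is the fault isolation
mentioned above, and can drop the packet.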
Constraints:
- Invariant header fields must be identified and included within the
ICRC calculation adding minor complexity to the overall implementation.
Alternative 3 (least preferred):
- Allow a PDU to span multiple TCP segments.
- Implement two CRCs: a header CRC and a data CRC.
- Do not allow intermediate endnodes to modify the iSCSI header.
Constraints / Disadvantages:
- Increased implementation complexity and overhead. The header
CRC must occur following the header, requiring injection / removal within
the endnodes. This complexity is compounded for variable header
protocols such as iSCSI and is why such a solution has been rejected in
other high-speed technologies.
- Requires intermediate CRC state to be maintained for both inbound and
outbound requests.
- Increased QoS scheduling complexity for strong end-to-end application
throughput.
- Does not solve the framing problem, perhaps necessitating
a chunking / RDMA solution. This increases solution complexity and
creates interoperability / support issues for customers, i.e. options are
bad for developers; bad for customers.
- Severely limits the creation of high-performance software-based iSCSI
implementations, perhaps making them impractical for general-purpose
use. This will limit the potential market for iSCSI
solutions.
- Note: If an intermediate endnode is allowed to modify the PDU header,
then there exists a possibility of silent data corruption since the
invariant portions no longer have end-to-end data integrity. This
will be a major issue for customers in terms of their ability to adopt
iSCSI across a variety of solution spaces, i.e. if there is the potential
for silent data corruption, then customers will not deploy iSCSI and will
turn to alternatives that provide stronger end-to-end data integrity.
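The "intermediate CRC state" constraint listed above can be illustrated
as follows: when a PDU's data spans several segments, the receiver has
to carry a running CRC and a remaining-byte count per connection between
segment arrivals. The class and field names are hypothetical; the
running-value argument of zlib.crc32 does the incremental update.

```python
import zlib


class DataCrcState:
    """Per-connection state kept while a PDU's data arrives in pieces."""

    def __init__(self, expected_len):
        self.remaining = expected_len  # data bytes of this PDU still to come
        self.crc = 0                   # running CRC over the bytes seen so far

    def feed(self, chunk):
        """Consume the part of one segment that belongs to this PDU.

        Returns True once all of the PDU's data has been seen.
        """
        take = chunk[:self.remaining]
        self.crc = zlib.crc32(take, self.crc)
        self.remaining -= len(take)
        return self.remaining == 0
```

With the single-segment rule of Alternatives 1 and 2, none of this state
has to survive past the processing of one segment.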
Last updated: Tue Sep 04 01:05:40 2001