Performance of iSCSI, FCIP and iFCP

Hi,

we would like to add a performance point of view to the recent discussion on IP storage protocols, one that shows significant differences between the protocols discussed. Any comment would be highly appreciated.

Briefly, transporting a set of storage connections between two storage enclaves using multiple TCP sessions (iSCSI and iFCP) provides significantly higher aggregate average throughput than transporting the same set of storage connections over a single TCP session (FCIP), the difference being proportional to the square of the number of TCP sessions. This is due to the way TCP congestion control reacts to packet losses.

This raises two questions: why are we talking about packet losses, and why is there such a difference in throughput?

There are several reasons for packet losses: network congestion, link errors and network errors.

Network congestion is pervasive in current IP networks, where the only way to control congestion is by dropping packets. Traffic engineering, admission control and bandwidth reservation are currently in early stages of definition. A QoS infrastructure supporting DiffServ will not be widely available in the near future (especially once we realize that it is not a simple matter of asserting the EF PHB bit as if it were the old IP ToS; it needs network services and supporting SLA negotiation). The DiffServ EF PHB RFC is currently under redefinition. On the other hand, if such a supporting QoS infrastructure were indeed available and pervasive today, why would we need TCP to begin with?

Even in a perfectly engineered network, link errors occur. If we take the Fibre Channel objective of a 10^-12 Bit Error Rate, then on a 10Gb/s link this is one error every 100 seconds.

Network errors also occur with significant frequency in IP networks.
Jonathan Stone and Craig Partridge recently reported at Sigcomm 2000 that network errors caught by the TCP checksum occur with significant frequency (between 1 packet in 1100 and 1 in 32000), without the link CRC catching them.

As for the second question, TCP throughput is impacted by each packet loss. Under the TCP congestion control algorithm present in all major implementations (Tahoe, Reno, New-Reno, SACK), each packet loss results in the TCP sender's congestion window being reduced to half of its current value, and therefore (assuming constant Round Trip Time) TCP's throughput is halved. After that, the window increases by roughly one packet every two Round Trip Times (assuming the popular Delayed-Acknowledgement algorithm). The temporary decrease in TCP's rate translates into an amount of data missing its transmission opportunity.

As we show later, for N storage connections sharing an IP "pipe" of rate E, the amount of data missing the opportunity to be transmitted due to a packet loss is

    D(N) = E^2/(N^2)*RTT^2/(256*M)

in the case of iSCSI and iFCP, and

    D(1) = E^2*RTT^2/(256*M) = D(N)*N^2

in the case of FCIP, where

    RTT = Round Trip Time
    M   = packet size

For example, for a set of N=100 connections totaling E=10Gb/s, with RTT=10ms and M=1500B, the data not transmitted in time due to a packet loss for iSCSI or iFCP is D(N)=2.6MB. For the same set transported over one TCP session as in FCIP, the data not sent in time is D(1)=26GB, a 10,000-fold increase.

The time interval for TCP to recover its sending rate to its initial value after a packet loss is I(N)=0.833 seconds in the case of iSCSI and iFCP, and I(1)=83.3 seconds in the case of FCIP.

Observe that in the case of FCIP, the time to recover its rate, I(1)=83.3s, is of the same order of magnitude as the time between two packet losses due exclusively to the link Bit Error Rate. In other words, a packet loss occurs almost immediately after TCP has recovered its rate.
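As a rough check, the example figures above can be reproduced with a few lines of Python. This is only a sketch: the function and variable names are illustrative and not part of any protocol or library, the recovery-time formula I = RTT^2*C/(8*M) is the one derived in "The math" part of this message, and the link is assumed to see evenly spaced bit errors at the stated Bit Error Rate.

```python
def recovery_time_s(rate_bps, rtt_s, pkt_bytes):
    """I = RTT^2 * C / (8 * M): seconds for one TCP session to climb
    back to rate C after a single packet loss."""
    return rtt_s ** 2 * rate_bps / (8 * pkt_bytes)


def deferred_bytes(rate_bps, rtt_s, pkt_bytes):
    """D = C^2 * RTT^2 / (256 * M): bytes that miss their transmission
    opportunity while the session recovers."""
    return rate_bps ** 2 * rtt_s ** 2 / (256 * pkt_bytes)


# Example parameters taken from the text.
E = 10e9       # aggregate pipe rate, bits/second
N = 100        # number of storage connections
RTT = 0.010    # round trip time, seconds
M = 1500       # packet size, bytes
BER = 1e-12    # Fibre Channel bit error rate objective

# iSCSI / iFCP: N sessions, each running at E/N; a loss hits one of them.
i_n = recovery_time_s(E / N, RTT, M)
d_n = deferred_bytes(E / N, RTT, M)

# FCIP: one session carrying the whole rate E.
i_1 = recovery_time_s(E, RTT, M)
d_1 = deferred_bytes(E, RTT, M)

# Mean time between bit errors on a link running at full rate E.
seconds_between_bit_errors = 1 / (BER * E)

print(f"iSCSI/iFCP: I(N) = {i_n:.3f} s, D(N) = {d_n / 1e6:.2f} MB")
print(f"FCIP:       I(1) = {i_1:.1f} s, D(1) = {d_1 / 1e9:.2f} GB")
print(f"D(1)/D(N) = {d_1 / d_n:.0f}")
print(f"time between bit errors = {seconds_between_bit_errors:.0f} s")
```

The ratio D(1)/D(N) comes out as N^2 regardless of the values chosen for E, RTT and M, since those terms cancel.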
With losses arriving that often, FCIP delivers on average about 3/4 of the required 10Gb/s rate, since 1/4 of the rate is lost during the time TCP's rate increases linearly from 1/2 to full rate. (More precisely, the effective rate is 8.27Gb/s, because 1/4 of the rate is lost during 83.3s and the time between two errors is now 120.825s due to the decreased sending rate.) By comparison, iSCSI or iFCP deliver approximately 9.99979Gb/s (i.e., they lose 1/4 of one TCP session's full rate of 100Mb/s during 0.833s out of a 100s interval).

In conclusion, from a performance point of view, transporting storage connections (SCSI or Fibre Channel) on multiple TCP sessions is much more effective than tunneling them through a single TCP session, and the difference is proportional to the square of the number of TCP sessions.

The math.

For a TCP session to sustain a rate of C bits/second, TCP's congestion window W (measured in packets) has to be

    W = RTT*C/(8*M)

where

    RTT = Round Trip Time in seconds
    M   = packet size in Bytes

After a single packet loss the window drops to W/2 and then grows by one packet every two Round Trip Times, so the time needed by the TCP sender to recover and have its sending rate reach the previous value C is

    I = 2*RTT*W/2 = RTT*W = RTT^2*C/(8*M)

During this interval the rate rises linearly from C/2 to C, so on average 1/4 of the rate C is lost. The total amount of data (in Bytes) missing the opportunity to be transmitted in the interval I is

    D = C/8*I/4 = C^2*RTT^2/(256*M)

If we consider a set of N storage connections sharing an IP "pipe" of rate E, they can be transported in N TCP sessions, as in iSCSI or iFCP. Assuming all connections are equal, each TCP session sends at a rate of E/N. One packet loss impacts only one TCP session, and thus the total amount of data missing the opportunity to be transmitted due to a packet loss is

    D(N) = E^2/(N^2)*RTT^2/(256*M)

On the other hand, if the same set of N storage connections is transported in one TCP session, as in FCIP, the total amount of data losing the opportunity to be transmitted due to a packet loss is

    D(1) = E^2*RTT^2/(256*M) = D(N)*N^2
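The effective-rate figures quoted earlier (8.27Gb/s for FCIP, 9.99979Gb/s for iSCSI or iFCP) follow from these formulas. The sketch below is one way to reproduce them, under stated assumptions: one packet loss for every 1/BER bits sent over the pipe, each loss hitting a single TCP session, and the affected session fully recovering before the next loss. The function name and its parameters are hypothetical, chosen only for this illustration.

```python
def effective_rate(pipe_bps, sessions, rtt_s, pkt_bytes, ber):
    """Return (average aggregate rate in bits/s, seconds between losses).

    Assumes evenly spaced losses, one per 1/ber bits sent, each affecting
    one of `sessions` equal TCP sessions that together fill `pipe_bps`.
    """
    session_bps = pipe_bps / sessions
    # I = RTT^2 * C / (8 * M): recovery time of the affected session.
    recovery_s = rtt_s ** 2 * session_bps / (8 * pkt_bytes)
    # During recovery the session averages 3/4 of its rate, so C/4 is lost.
    deficit_bits = (session_bps / 4) * recovery_s
    bits_between_losses = 1 / ber
    # The pipe needs extra time to push 1/ber bits because of the deficit.
    loss_interval_s = (bits_between_losses + deficit_bits) / pipe_bps
    if loss_interval_s < recovery_s:
        raise ValueError("next loss arrives before recovery completes; "
                         "this simple model does not apply")
    return bits_between_losses / loss_interval_s, loss_interval_s


E, RTT, M, BER = 10e9, 0.010, 1500, 1e-12

fcip_bps, fcip_interval_s = effective_rate(E, 1, RTT, M, BER)
iscsi_bps, iscsi_interval_s = effective_rate(E, 100, RTT, M, BER)

print(f"FCIP:       {fcip_bps / 1e9:.4f} Gb/s, loss every {fcip_interval_s:.2f} s")
print(f"iSCSI/iFCP: {iscsi_bps / 1e9:.6f} Gb/s, loss every {iscsi_interval_s:.3f} s")
```

For FCIP the interval between losses stretches from 100s to roughly 120.8s because the single session spends most of that time below full rate; with 100 sessions the interval stays within a few milliseconds of 100s.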
For more details on TCP performance see for example:

"Modeling TCP Reno Performance: A Simple Model and its Empirical Validation." J. Padhye, V. Firoiu, D. Towsley and J. Kurose, IEEE/ACM Transactions on Networking, April 2000.

Franco Travostino, Victor Firoiu
-----
Content Internetworking Lab, Technology Center
Nortel Networks, Inc.
600 Technology Park
Billerica, MA 01821 USA