Re: Choice of ESP alg. for IPS/IPSec - 3DES-CBC vs. 3DES-CBC-I

In our analysis of algorithms, we have been constrained by the transforms that exist or are under development in the IPsec WG. In general, the IPsec WG takes its lead from NIST/ANSI, which look not only at performance and implementability in hardware and software, but also at security and intellectual property issues. To reference a given algorithm in the IPS security draft, we need to be able to reference an IPsec WG transform document, and that in turn tends to depend on standardization of the algorithms and modes in question.

Thus, the algorithms specified in the draft all correspond to algorithms that NIST has considered or is currently considering, and for which IPsec transform documents exist or are under development. You will notice that these algorithms are not necessarily the optimal ones (at least judged by software performance metrics). For example, AES-OCB has a considerably lower cycles-per-byte cost in software than AES-CTR mode + CBC-MAC with XCBC extensions, though I'm told that they're roughly equivalent in hardware.

Another example of non-optimal algorithm selection occurs in the authentication algorithms, where we have HMAC-SHA1 as a MUST and CBC-MAC with XCBC extensions as a SHOULD. As I am sure you are aware, authentication algorithms such as PMAC are MUCH more efficient to implement in hardware, and algorithms such as UMAC are MUCH more efficient to implement in software, than the algorithms that we chose. The problem was that, as we understood it, neither PMAC nor UMAC was far enough along in the NIST process. Ultimately, the algorithms that end up in the final security document will largely be gated by which IPsec transform documents can be standardized in the necessary timeframe.

We chose 3DES-CBC and HMAC-SHA1 as MUST implement because they were already widely implemented and IPsec transform documents exist that we can reference, although the performance of both algorithms is less than ideal for 1+ Gbps operation. The argument was that everyone could at least implement these algorithms, warts and all.

Given that 3DES-CBC-I has already been standardized by ANSI, it may be feasible to get an IPsec transform document written and adopted as a work item by the IPsec WG. If this can happen, then it would be possible to argue the merits of this algorithm versus the other ones under consideration. Given the prevalence of 3DES-CBC, however, I suspect that the argument would be over whether 3DES-CBC-I would become a MAY or a SHOULD implement, rather than a MUST.

>From: "Mukund, Shridhar" <Shridhar_Mukund@adaptec.com>
>To: ips@ece.cmu.edu
>CC: "Mukund, Shridhar" <Shridhar_Mukund@adaptec.com>
>Subject: Choice of ESP alg. for IPS/IPSec - 3DES-CBC vs. 3DES-CBC-I
>Date: Fri, 30 Nov 2001 18:15:15 -0800
>
> Hello,
>
> Re: Choice of ESP alg. in
> http://www.ietf.org/internet-drafts/draft-ietf-ips-security-06.txt
>
> Question:
> As noted, we need an algorithm implementable in hardware at speeds of
> up to 10Gbps, as well as being efficient for implementation in
> software at speeds of 100Mbps or slower. AES-CTR is an excellent
> solution. But then it will take time to get approved and further time
> to get "time tested" before being adopted. Even after adoption of
> AES-CTR, 3DES-CBC will need to co-exist for many years to come.
>
> 3DES-CBC does not gracefully scale to 10Gbps for two reasons:
> 1. Frequent rekeying at 10Gbps: This issue is discussed in depth in
>    the draft. Although very inconvenient, state-of-the-art IKE stacks
>    (esp. when running on an off-load processor) can deal with it.
> 2. Lack of pipeline-ability: The feedback loop dictated by CBC
>    prohibits pipelined high-speed VLSI implementation of the 3DES-CBC
>    engine.
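
To put a rough number on the rekeying point quoted above, here is a back-of-the-envelope sketch. It assumes a birthday-bound budget of 2^32 blocks per key for a 64-bit block cipher; that budget is an assumption made for illustration, not the limit the draft actually recommends, and the function name is made up here.

```python
def seconds_until_block_budget(line_rate_bps, block_bits=64, block_budget=2**32):
    """Time to push `block_budget` cipher blocks through one key at line rate.

    block_budget=2**32 is the generic birthday bound for a 64-bit block
    cipher (an assumption for this sketch, not a figure from the draft).
    """
    total_bits = block_bits * block_budget
    return total_bits / line_rate_bps


if __name__ == "__main__":
    for label, rate in (("100 Mbps", 100e6), ("1 Gbps", 1e9), ("10 Gbps", 10e9)):
        print(f"{label:>8}: {seconds_until_block_budget(rate):10.1f} s per key")
```

Under that assumed budget a single key lasts roughly 27 seconds at 10Gbps, versus about 46 minutes at 100Mbps, which is why rekeying becomes a practical burden at the high end.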
>
> The ANSI standard X9.52-1998, which specifies 3DES-CBC (TCBC), also
> specifies an equally standard variant called TCBC-I (say,
> 3DES-CBC-Interleaved) with the same security properties. The effort
> required to enhance existing software and VLSI implementations of
> 3DES-CBC to 3DES-CBC-I is "minor". 3DES-CBC can be realized simply
> through a degenerate usage of the 3DES-CBC-I module. On the positive
> side, it brings "substantial" savings in multi-gig VLSI
> implementation.
> Was the candidate ESP algorithm 3DES-CBC-I (superset of 3DES-CBC)
> considered for the SHOULD implement option? Eventually something like
> AES-CTR will pervade, but for the interim this is indeed a low-cost
> option to get to speeds up to 10Gbps.
>
> Comments on the VLSI implementation:
> A 3DES (not 3DES-CBC) engine by itself is highly pipeline-able and can
> pump 10Gbps even on an FPGA. However, for 3DES-CBC, one has to wait
> for 3DES to be completed on a given 64-bit symbol before commencing
> 3DES on the next symbol. As a result, the max throughput of a "single"
> 3DES-CBC engine is somewhere above 1Gbps, depending on the process
> technology.
>
> As usual, there is a brute-force solution to the problem, which
> requires use of multiple 3DES-CBC units. These engines take up
> significant silicon real estate. The implementation complexity is not
> just due to the multiplicity of 3DES-CBC units but more so due to all
> the "incidental" kitchen-sinks and bath-tubs that get thrown into the
> cauldron to support the multiplicity: scheduler, buffers per engine
> (think jumbo frames), keeping track of contexts (10Gbps traffic could
> all belong to the one connection or multiple connections), latency,
> power, ...
>
> 3DES-CBC-I partitions the symbol stream into three sub-streams so that
> a single engine with three pipeline stages can pump 3X throughput and
> hence bring about a 3X reduction in the kitchen-sink count and
> complexity.
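
The partitioning described in the quoted text can be sketched as follows. This is only an illustration under stated assumptions: `toy_encrypt_block` is a made-up, insecure 64-bit Feistel stand-in for a real 3DES block operation, the IVs are simply supplied by the caller (how X9.52 derives the per-sub-stream IVs is not modeled), and none of the function names come from the standard.

```python
import hashlib

BLOCK = 8  # bytes; 3DES operates on 64-bit blocks


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for 3DES: a 4-round Feistel network. NOT secure."""
    left, right = block[:4], block[4:]
    for rnd in range(4):
        f = hashlib.sha256(key + bytes([rnd]) + right).digest()[:4]
        left, right = right, xor(left, f)
    return left + right


def cbc_encrypt(key, iv, blocks):
    """Plain CBC: block i cannot start until block i-1 has finished."""
    out, prev = [], iv
    for p in blocks:
        prev = toy_encrypt_block(key, xor(p, prev))
        out.append(prev)
    return out


def interleaved_cbc_encrypt(key, ivs, blocks):
    """Interleaved CBC: block i chains to block i-k, where k = len(ivs).

    k consecutive blocks are mutually independent, so a k-stage pipelined
    engine could have all of them in flight at once.
    """
    k = len(ivs)
    prev = list(ivs)  # one chaining value per sub-stream
    out = []
    for i, p in enumerate(blocks):
        lane = i % k
        prev[lane] = toy_encrypt_block(key, xor(p, prev[lane]))
        out.append(prev[lane])
    return out


if __name__ == "__main__":
    key = b"demo-key"
    blocks = [bytes([i]) * BLOCK for i in range(9)]
    ivs = [b"\x01" * BLOCK, b"\x02" * BLOCK, b"\x03" * BLOCK]
    ct = interleaved_cbc_encrypt(key, ivs, blocks)
    # Each sub-stream is an ordinary CBC chain over every third block.
    for lane in range(3):
        assert ct[lane::3] == cbc_encrypt(key, ivs[lane], blocks[lane::3])
    # With a single sub-stream the interleaved routine is plain CBC.
    assert interleaved_cbc_encrypt(key, ivs[:1], blocks) == cbc_encrypt(key, ivs[0], blocks)
    print("interleaved CBC matches per-sub-stream CBC")
```

In this sketch the "degenerate usage" mentioned in the quote corresponds to passing a single IV, which collapses the routine to ordinary CBC; whether the real TCBC-I module degenerates in exactly this way is an assumption.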
>
> Furthermore: At the time 3DES-CBC-I was conceived, multi-gig
> throughput at the network end-point was probably not anticipated (my
> guess). As a result, they stopped at tri-partitioning or 3 levels of
> interleaving (my guess). After all, it is only the IP Storage
> application that is pioneering multi-gig IPSec throughput at the end
> point. If we used 8 levels of interleaving, we could pump all 10Gbps
> of throughput through a single engine using current process
> technologies. No kitchen-sinks, no bath-tubs!
>
> Thoughts, Comments, Concerns?
>
> -Shridhar Mukund
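
The engine-count argument in the quoted message reduces to simple arithmetic, sketched below. The per-chain rate of 1.25Gbps is a hypothetical figure chosen only to be consistent with "somewhere above 1Gbps", and the model assumes an engine with k pipeline stages sustains k times the single-chain rate, as the quote suggests for k = 3.

```python
import math


def engines_needed(target_gbps, per_chain_gbps, interleave_levels):
    """Number of engines required to reach target_gbps.

    Assumes (as a simplification) that one engine with `interleave_levels`
    pipeline stages sustains interleave_levels * per_chain_gbps.
    """
    per_engine_gbps = per_chain_gbps * interleave_levels
    return math.ceil(target_gbps / per_engine_gbps)


if __name__ == "__main__":
    PER_CHAIN_GBPS = 1.25  # hypothetical single-chain 3DES-CBC rate
    for levels in (1, 3, 8):
        n = engines_needed(10, PER_CHAIN_GBPS, levels)
        print(f"{levels}-level interleaving: {n} engine(s) for 10Gbps")
```

With these assumed numbers, plain CBC needs 8 engines, 3-level interleaving needs 3, and 8-level interleaving needs 1, which matches the shape of the argument above. The real figures depend on process technology.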
Last updated: Mon Dec 03 11:17:39 2001