
    Re: TCP Framing (considered helpful?)



    John,
    
    Two comments.  I may have misunderstood what you're saying since
    your initial comments imply framing=warp/RDMA/messageboundary,
    and towards the end, you seem to imply framing=markers.
    
    - It's true that implementing iSCSI-level markers does not require
      modifying TCP/IP stacks.  But using the markers (I mean using them
      to steer effectively) requires TCP/IP changes!  Whether that work
      is offloaded onto the NIC is a separate matter.
    
    - You seem to implicitly suggest that WARP (when it becomes available)
      must go into host software stacks to be useful, and hence cannot be 
      done in the right timeframe for iSCSI.  Since one could envision 
      offloading WARP onto the NIC as well, I assume you're hinting 
          a) either at interoperability issues between software
             and hardware iSCSI implementations relying on WARP,
          b) or at interoperability issues between hardware and
             hardware implementations.
    
      iSCSI currently mandates a "no sync and steering" mode which 
      ensures interoperability in either case.  Perhaps you were really 
      concerned about performance in case (b) then?
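
To illustrate the first point above, here is a minimal sketch of the
general marker idea. Everything in it is an assumption made for
illustration, not the marker format in the iSCSI draft: the 64-byte
interval, the 4-byte marker holding "data bytes until the next PDU
header", and the toy length-prefixed PDU. The helpers make_pdu,
build_stream and next_pdu_after_marker are hypothetical names.

```python
import struct

MARKER_INTERVAL = 64  # hypothetical: PDU data bytes between markers
MARKER_SIZE = 4       # hypothetical: one 32-bit big-endian offset


def make_pdu(payload):
    """Toy PDU: 4-byte big-endian payload length, then the payload."""
    return struct.pack(">I", len(payload)) + payload


def build_stream(pdus):
    """Sender side: concatenate PDUs, inserting a marker before every
    MARKER_INTERVAL-th data byte (except at offset 0)."""
    data = b"".join(pdus)
    # Data offsets at which each PDU starts, plus the end of the data.
    starts, pos = [], 0
    for pdu in pdus:
        starts.append(pos)
        pos += len(pdu)
    starts.append(pos)

    out = bytearray()
    for off in range(0, len(data), MARKER_INTERVAL):
        if off > 0:
            nxt = min(s for s in starts if s >= off)
            out += struct.pack(">I", nxt - off)  # data bytes to next header
        out += data[off:off + MARKER_INTERVAL]
    return bytes(out)


def marker_stream_pos(k):
    """Stream offset of the k-th marker (k >= 1)."""
    return k * MARKER_INTERVAL + (k - 1) * MARKER_SIZE


def next_pdu_after_marker(stream, k):
    """Receiver side: from marker k alone, compute the stream offset of
    the next PDU header. Note that this needs the absolute stream
    offset, which only the TCP layer knows for out-of-order segments."""
    (delta,) = struct.unpack_from(">I", stream, marker_stream_pos(k))
    data_off = k * MARKER_INTERVAL + delta
    # Add back the markers that precede that data byte in the stream.
    return data_off + MARKER_SIZE * (data_off // MARKER_INTERVAL)
```

In this sketch build_stream needs nothing from TCP, which matches the
claim that emitting markers leaves the stack alone. The receiver helper,
by contrast, is keyed on the byte's position in the TCP stream, so a
receiver that wants to steer before reassembly has to run it inside (or
alongside) the TCP receive path.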
    --
    Mallikarjun 
    
    
    Mallikarjun Chadalapaka
    Networked Storage Architecture
    Network Storage Solutions Organization
    MS 5668	Hewlett-Packard, Roseville.
    cbm@rose.hp.com
    
    
    >Replies in text below (between [Huff] and  [/Huff]  ).
    >
    >.
    >.
    >.
    >John L. Hufferd
    >Senior Technical Staff Member (STSM)
    >IBM/SSG San Jose Ca
    >(408) 256-0403, Tie: 276-0403,  eFax: (408) 904-4688
    >Internet address: hufferd@us.ibm.com
    >
    >
    >Stephen Bailey <steph@cs.uchicago.edu>@ece.cmu.edu on 05/21/2001 06:50:41 AM
    >
    >Sent by:  owner-ips@ece.cmu.edu
    >
    >
    >To:   ips@ece.cmu.edu
    >cc:
    >Subject:  Re: TCP Framing (considered helpful?)
    >
    >
    >
    >John,
    >
    >> I think we must depend on Markers to ensure that everything can operate
    >> at top speed, and at the lowest cost.
    >
    >A key question is whether markers actually ensure that everything
    >operates at `top speed, and at the lowest cost'.
    >
    >Matt thinks so.  I (and, presumably those who wrote the framing
    >document) think not.
    >
    >[Huff] I do not think you can say that.  I also support framing (Warp), as
    >a much more elegant solution, but I find it inappropriate to depend on it
    >actually happening, and being made available in the various OSs in the time
    >frame we need.  I do believe that, over time, it will be made available, and
    >is a better approach for all TCP/IP applications that can use it. [/Huff]
    >
    >My issue is not even with `lowest cost'.  I don't believe markers will
    >allow you to run at top speed.  Specifically:
    >  1) I doubt the feasibility of implementing the control required for
    >     an eddy buffer (where you store data you can't place) at 10G.
    >     Admittedly, the validity of this claim can't really be assessed
    >     without actually working the implementation, so for 99% of the
    >     list participants (myself included) this is a `yes it is, no it
    >     isn't' point.
    >
    >   [Huff] I believe this has had much more work done on it than you
    >      think.  I have personally stepped through the proposals from
    >      several vendors that are working on this option for their HW HBAs.
    >      Usually, because of the iSCSI PDU headers, the data/commands
    >      can be placed directly into the SCSI Host buffers, almost every
    >      time. Only when the PDU headers arrive slightly out of order
    >      (due to normal routing) are the packets unable to be placed
    >      directly into the Host buffer.  And that requires some, but only
    >      a small amount, of buffering space.
    >      It is the packet drops that occur on PDU headers, and resultant
    >      error retries, that cause the need for large amounts of "on
    >      HBA/chip" buffering.
    >      So by using Markers, these HW iSCSI HBAs can limit the amount of
    >      buffering on the chip/HBAs. [/Huff]
    >
    >  2) an eddy buffer solution requires some substantial speed-up in
    >     both the NIC data path, and MOST IMPORTANTLY: the host bus.  In
    >     order to unload the eddy buffer while still handling incoming
    >     traffic at line rate, clearly the host bus bandwidth must be >
    >     line rate.
    >
    >     [Huff] This is not an effect of an eddy buffer solution, it is a
    >     fact that every TCP/IP NIC has to deal with.  Especially at the
    >     new speeds.  Our current PCI bus will not support 10 Gigabit; further,
    >     PCI-X will not support it either, and even PCI-DDR does not fully support
    >     the full data rate.  So it needs to rely on the TCP/IP window
    >     management.  The only other thing you can do is drop the packets.
    >     This clearly makes the problem worse. [/Huff]
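
A rough worked version of the bus arithmetic in point 2 and the reply
to it, as a sketch under stated assumptions. The bus figures are
nominal width times nominal clock (theoretical peaks that ignore
arbitration and protocol overhead), PCI-DDR is left out because the
thread gives no figures for it, and drain_seconds is a hypothetical
helper that models the eddy buffer as emptying at "bus rate minus
line rate".

```python
LINE_RATE_GBPS = 10.0  # 10 Gigabit line rate, one direction

# Nominal (width in bits, clock in MHz). Assumed textbook values,
# not measurements.
BUSES = {
    "PCI 32-bit/33 MHz": (32, 33),
    "PCI 64-bit/66 MHz": (64, 66),
    "PCI-X 64-bit/133 MHz": (64, 133),
}


def peak_gbps(width_bits, clock_mhz):
    """Theoretical peak bus bandwidth in Gbit/s."""
    return width_bits * clock_mhz / 1000.0


def drain_seconds(buffered_bytes, bus_gbps, line_gbps=LINE_RATE_GBPS):
    """Time to empty a buffer while traffic keeps arriving at line rate.
    Returns None when the bus has no surplus, so the buffer never drains."""
    surplus = bus_gbps - line_gbps
    if surplus <= 0:
        return None
    return buffered_bytes * 8 / (surplus * 1e9)


if __name__ == "__main__":
    for name, (width, clock) in BUSES.items():
        gbps = peak_gbps(width, clock)
        print(name, round(gbps, 3), "Gbit/s, drains 1 MB in",
              drain_seconds(1_000_000, gbps))
```

Under these assumptions none of the listed buses has any surplus over
10 Gbit/s, so the buffer cannot be unloaded while the link stays
saturated. That is consistent with both sides of the exchange: the host
bus must exceed line rate for an eddy buffer to drain, and the buses
named here do not, which leaves TCP window management or packet drops.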
    >
    >
    >I know of at least one general purpose framed solution operating at
    >10G which has been available for >3 years (SGI's GSN/ST/XIO NIC).  I'm
    >sure there are others.
    >
    >I can't imagine there's any argument that a framed solution would be
    >voted `most likely to run fast and be cheap'.  Every storage network
    >and cluster interconnect has been designed that way since antiquity.
    >
    >The key tradeoff involves the OS vendors, and I'm wondering why we're
    >speaking for them.  The question IS, how much more work is it to
    >introduce TCP framing over and above what is required to insert iSCSI
    >into their network framework.  My experience from writing NIC and
    >storage drivers for many commercial UNIX-family OSes is:
    >  1) it's an easy and well defined process to insert a new SCSI
    >     transport driver into the SCSI stack.
    >  2) it's a hard and poorly defined process to insert ANYTHING into the
    >     network stack.
    >[Huff] I think you are making my point.  This is the problem with SW
    >Stacks.  That is why I believe that it will take a very long time for
    >the various vendors to include such changes into their "bet your business"
    >TCP/IP SW Stacks.  The point that Matt and I have been trying to make
    >is that most OS vendors are NOT creating the iSCSI HW HBAs (NICs).
    >These iSCSI HW HBAs (NICs) have the TCP/IP completely on the HBA, and
    >they have added the iSCSI processing also so that they can steer the
    >packets directly into the appropriate SCSI Host buffers. Adding either
    >Markers or Framing into the iSCSI HW HBAs is not a big problem.  It is
    >only a problem of getting Framing (timely) into Host TCP/IP Stacks.
    >[/Huff]
    >
    >Networking has historically been a user-mode activity.  Architected
    >services are only provided to user mode programs.  Kernel clients have
    >been few and far between and so are handled on a case-by-case basis.
    >For example NFS.  Every OS has hacks to make NFS run fast, but they
    >are not stable interfaces for general purpose use.
    >
    >Even Solaris' SysV-derived STREAMS stack, which is intended precisely
    >to provide flexible, crisp interfaces for kernel network clients, does
    >not document the relevant (IP stack) intermodule interfaces.
    >
    >I know that there are more and more kernel network clients, but they
    >are coming either on fluid platforms (e.g. linux), in which case the
    >argument of `it'll take too long to get OS support' doesn't apply, or
    >they are vendor-supplied, in which case a performance iSCSI solution
    >in ANY form may take a while, and the choice of framing or markers
    >isn't going to make a difference.
    >
    >[Huff] I think you are saying something I agree with and something I
    >do not agree with.  That is, that software changes to TCP/IP in the
    >various "Bet your Business" OSs, will take some time.  However, it is
    >not true that new iSCSI device drivers will take very long.  Two types
    >are being created today, by Cisco, IBM, Intel, etc.: iSCSI DDs that
    >make calls to normal TCP/IP stacks, and the DDs that are being written
    >by the iSCSI HW HBA vendors.  These do not require the OS vendor to do
    >anything special.  This is happening NOW (check with Cisco, Intel,
    >and IBM (me?)).  The last thing we want is for a dependency on a
    >TCP/IP change to get in the way of our momentum. [/Huff]
    >
    >I can't say squat about the architecture of Winsock, but the fact that
    >there is a Microsoft author of the framing proposal who seems very
    >serious about supporting framing and RDMA as quickly as possible
    >suggests that framing support should be available on Windows very
    >soon.
    >
    >[Huff] My following statements are not meant as a criticism of Microsoft.
    >However, they, like all producers of key, complicated new software, do
    >not bring it to the general market as quickly as HW vendors would
    >like.
    >
    >I believe that Microsoft's heart is in the right place on this issue,
    >and that they will do the right thing with framing, over time.
    >But it is not clear in what release that will be shipped, nor in which
    >support pack it will be included.  Also it is not clear how the support
    >will be handled for current Win2k, WinNT etc.
    >
    >This is why I think we should have Framing a Must implement
    >and an Optional to use.  It is the easiest thing for SW to
    >create, and brings the needed cost reduction to iSCSI HW and
    >it is completely under our (iSCSI protocol) control.
    >[/Huff]
    >
    >
    >
    >Steph
    >
    >
    >
    >
    
    
    



Last updated: Tue Sep 04 01:04:38 2001
6315 messages in chronological order