Re: iSCSI: Flow Control

At 07:18 PM 10/14/00 -0700, Matt Wakeley wrote:
>Somesh,
>
>I still don't understand what you are trying to solve.
>
>With the iSCSI session wide command credit method, there is a portion of the
>iSCSI layer that sits right below the SCSI layer. It receives the commands
>from the SCSI layer and passes the results of each I/O from each NIC back to
>the SCSI layer. The MaxCmdRn indicates how many commands the target (as a
>whole) can "buffer". The iSCSI layer will "scatter" the commands to the NICs
>until it has used up the MaxCmdRn buffers. Each NIC, once iSCSI has posted a

The added advantage is that the policy for this "scatter" can be determined
outside of iSCSI and can adapt to changing conditions or to the attributes of
the individual NICs / paths. iSCSI then performs the actual scatter while
understanding only the scatter algorithm itself.

>Each Target NIC will have a pool of buffers to receive asynchronous (non DATA)
>iSCSI messages. As each (small) command message is received, it is placed
>into one of these buffers, processed by common iSCSI and the CDB is passed to
>the SCSI layer which stores it into its command buffer. The message buffer is
>then given back to the NIC for further messages.

The buffer management for this is implementation-specific, but what is
described is one viable alternative.

>In your description, the initiator still "scatters" the commands to the NICs,
>then the NICs have the burden of trying to figure out if they can send the
>command or not. Furthermore, if some NICs have open TCP windows, but don't
>have command credit, the command can't be sent.
>
>In the iSCSI session wide credit model, the initiator will not post commands
>to any NIC if it doesn't have credit. Any commands posted to a NIC will be
>sent as long as its TCP window is open.

And it can dynamically adjust its scatter algorithm to bypass a connection
that is unable to make forward progress, without any added complexity. It can
also use this bypass as a tracking mechanism for potential problems within the
connection itself, e.g. being bypassed N times indicates the connection may be
hung; probe to determine whether that is true and initiate recovery as
required.

>Having a session wide MaxCmdRn allows the initiator to stop sending SCSI
>commands, while still enabling non command messages to be sent. They are
>received by each NIC and passed to iSCSI for processing, but since they are
>not passed up to SCSI, nothing is overflowed.

Correct.

> > 2. Have the NICs grab them from a pool through an atomic bus
> > transaction. That has got to be tougher to implement than it
> > looks, and the bus performance issues due to the need to maintain
> > ordering etc?
>
>As indicated above, each NIC passes the iSCSI messages to a central iSCSI
>message processor that sends the appropriate SCSI messages to SCSI.

The only "red" flag is the potential scalability issue, since this is done
through a "central" entity. For a large SMP, central translates into poor
scalability. One really would prefer to have this distributed among a set of
processors that operate in parallel with minimal critical regions to contend
for. Problems like this get worse when the ratio of processors to NICs gets
too large. As we move towards 10 GbE, this ratio is likely to be fairly large,
perhaps as high as 4:1, which with a sufficiently high IOP rate can create
contention and inefficiency within the endnode.

Mike
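
To make the session-wide credit and bypass ideas above concrete, here is a rough sketch, not anything taken from the draft. All names (Connection, Session, bypass_limit, suspects) are hypothetical. The credit is modelled as a plain counter of free target command buffers, and window_open stands in for "the TCP window is open"; both are simplifying assumptions. The scatter policy shown is simple round-robin, which is only one of many policies that could be plugged in.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Connection:
    """One NIC / TCP connection within the session (hypothetical model)."""
    name: str
    window_open: bool = True   # assumption: stands in for "TCP window is open"
    bypass_count: int = 0      # consecutive times the scatter skipped this one
    sent: List[str] = field(default_factory=list)


class Session:
    """Session-wide command credit shared by every connection."""

    def __init__(self, connections: List[Connection], max_cmd_credit: int,
                 bypass_limit: int = 3):
        self.connections = connections
        self.credit = max_cmd_credit      # commands the target as a whole can buffer
        self.bypass_limit = bypass_limit  # assumed threshold ("N times")
        self._next = 0                    # round-robin cursor

    def post(self, command: str) -> Optional[Connection]:
        """Scatter one command; return the connection used, or None if held."""
        if self.credit == 0:
            return None                   # no credit: post to no NIC at all
        n = len(self.connections)
        for i in range(n):
            conn = self.connections[(self._next + i) % n]
            if conn.window_open:
                conn.sent.append(command)
                conn.bypass_count = 0     # it made forward progress
                self.credit -= 1
                self._next = (self._next + i + 1) % n
                return conn
            conn.bypass_count += 1        # bypassed: keep track of it
        return None                       # credit available, but no path is open

    def complete(self) -> None:
        """The target freed one command buffer, so one credit comes back."""
        self.credit += 1

    def suspects(self) -> List[str]:
        """Connections bypassed often enough to be worth probing."""
        return [c.name for c in self.connections
                if c.bypass_count >= self.bypass_limit]
```

Note that the single Session object here is exactly the kind of "central" entity the scalability concern applies to: on a large SMP, every post() would contend for the same credit counter and cursor.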