RE: Requirements specification

Doug,

Again, the current protocol allows you to do what you want - e.g., build a "virtual target"/LU if this is the way you think things will scale.

The paradox of decreased performance per drive due to the increase in recording density is not lost on the storage industry, and the major techniques through which it attempts to mitigate it are caching and striping. The numbers you quote are pure drive numbers. For the drive-to-controller cache you might use a "lightweight iSCSI" (software only) or some other mechanism. From controller to host - once you use one of the boosting techniques (caching, striping) you will need fast channels.

The protocol is very simple (multiplexing LUs is just another field). You can also use it with an initiator-LU scheme, but if we settle for that design we can't use it in larger controllers.

Regards,
Julo

"Douglas Otis" <dotis@sanlight.net> on 07/08/2000 19:14:30
Please respond to "Douglas Otis" <dotis@sanlight.net>
To: Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
cc:
Subject: RE: Requirements specification

Julo,

As your architecture is based on using a controller to aggregate data rather than a switch, you are making choices detrimental to an architecture that brings the interface closer to the device. This is reflected in choices for configuration, authentication and protocol. Your architecture is very close to a Fibre-Channel gateway. Alter solicitation within Fibre-Channel, and there is not a significant difference (spoofing solicitations within the gateway would be one means). As an example, you require that data successfully delivered be retained for possible later solicitation, with a controller-based error recovery. Having two means of delivering data and error recovery is just one example of added complexity due to an inability to scale data handling.

Although read channels and data densities improve at a steady pace, as they have for the past quarter century, the mechanics of the drives have not. Today's drives can deliver 320 Mbits/second of data on the outside cylinders. The physical size of the drive, in conjunction with the number of heads and disks, has a substantial impact in a competitive market with respect to power and cost. The cost/volume trend takes us to a single larger disk, which paradoxically increases access time as read channel data rates increase. You optimize to take advantage of the burst performance of the read channel, with added complexities attempting to time or stage such transfers through your architectural restrictions where the device becomes part of this fabric.

Is it logical to design a system where everything is aimed at taking advantage of the high momentary data rate offered by the read channel, or to offer the same throughput using more devices where each interface bandwidth is 'restricted' with respect to these read channel data rates? The advantage of the latter approach is found with respect to smaller random traffic. With more devices, redundancy is easily achieved and parallel access offers a means of performance improvement by spreading activity over more devices. In this case, the switch provides bandwidth aggregation and each device would see only its own traffic, but the client could see the traffic of hundreds of these devices. Regardless of the nature of the traffic, performance would be more uniform and control could be left at the client.

An 8 ms access + latency figure in the high-cost drives restricts the number of 'independent' operations that average 64 Kbytes to 100 per second, or 52 Mbit per second. Forgoing the peak data rate, such an architecture of 'restricted' drives would scale, whereas the controller-based approach does not and is vulnerable. An independent nexus at the LUN is the only design that offers the required scaling and configuration flexibility.

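The figures of roughly 100 operations per second and 52 Mbit per second can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the function name is hypothetical, and it assumes that each independent operation costs the 8 ms access + latency time plus the time needed to transfer 64 Kbytes at the 320 Mbit/s read channel rate, with no caching or queueing effects.

```python
import math


def random_io_throughput(access_ms, op_bytes, media_mbit_per_s):
    """Return (operations per second, sustained Mbit/s) for one drive.

    Assumption: every operation pays the full access + latency time and is
    then transferred at the media (read channel) rate.
    """
    op_bits = op_bytes * 8
    transfer_ms = op_bits / (media_mbit_per_s * 1e6) * 1e3
    service_ms = access_ms + transfer_ms
    ops_per_s = 1000.0 / service_ms
    mbit_per_s = ops_per_s * op_bits / 1e6
    return ops_per_s, mbit_per_s


# Figures quoted in the message: 8 ms, 64 Kbyte operations, 320 Mbit/s media rate.
ops, mbit = random_io_throughput(access_ms=8.0, op_bytes=64 * 1024,
                                 media_mbit_per_s=320.0)

# Number of such 'restricted' drives whose combined random throughput
# matches the peak read channel rate of a single drive.
drives_for_peak = math.ceil(320.0 / mbit)

print(f"{ops:.0f} ops/s, {mbit:.0f} Mbit/s per drive, "
      f"{drives_for_peak} drives to match 320 Mbit/s")
```

Under these assumptions a single drive sustains only a fraction of its peak read channel rate on random traffic, which is the gap the argument for spreading activity over more devices relies on.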
Switch and client aggregation makes sense in cost, performance, capacity, reliability, and scalability. Protocol overhead should be addressed in the protocol itself and not by means of controller aggregation. There are substantial improvements to be made in the protocol area without the use of intervening controllers.

Doug

-----Original Message-----
From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu] On Behalf Of julian_satran@il.ibm.com
Sent: Saturday, August 05, 2000 4:46 AM
To: ips@ece.cmu.edu
Subject: RE: Requirements specification

Doug,

The current architecture is good for the whole spectrum. If you are intent on using it for a disk drive, you can do so and fill with 0 the fields you are not interested in. You don't have to implement the functions that are intended for controllers. The controller/drive scaling controversy is certainly outside the scope of iSCSI.

Regards,
Julo

"Douglas Otis" <dotis@sanlight.net> on 04/08/2000 19:20:40
Please respond to "Douglas Otis" <dotis@sanlight.net>
To: Julian Satran/Haifa/IBM@IBMIL, ips@ece.cmu.edu
cc:
Subject: RE: Requirements specification

Julo,

Your comments are based on several assumptions reflecting your present architecture. Your implementation is done at the controller rather than a device. You also assume authentication is done at the controller. Each LUN could belong to a different authority and be an independent (virtual) device managed through LDAP. If you bring the interface to the device, you can obtain the required scaling that is otherwise difficult at the controller, as with your architecture. By combining everything into a single connection, you do not improve reliability, scalability, availability or fault tolerance.

Doug

-----Original Message-----
From: owner-ips@ece.cmu.edu [mailto:owner-ips@ece.cmu.edu] On Behalf Of julian_satran@il.ibm.com
Sent: Thursday, August 03, 2000 7:37 PM
To: ips@ece.cmu.edu
Subject: Re: Requirements specification

David,

The one additional requirement is availability/fault-tolerance.

Your arguments about performance are valid. However, I doubt that there will be enough incentives - beyond price - to develop things for high-end controllers and servers. Enabling multiple connections brings those applications the performance required without any serious implications for the rest of the "family" (as I outlined in Pittsburgh, controllers and servers that don't need multiple connections per session don't have to implement them). Storage traffic requirements will always exceed those of many other applications.

As for "one-connection-per-LU", we covered this solution in long discussions and even several full-fledged implementations, as it is compellingly simple. However, the resource consumption is unjustifiably high and the security problems are even worse (the LUs "viewed" by an initiator depend on who he says he is) than in the current draft.

Regards,
Julo

David Robinson <David.Robinson@EBay.Sun.COM> on 04/08/2000 02:43:11
Please respond to David Robinson <David.Robinson@EBay.Sun.COM>
To: ips@ece.cmu.edu
cc: (bcc: Julian Satran/Haifa/IBM)
Subject: Requirements specification

This is to further elaborate on my comments in Pittsburgh on multiple connections per link and connections per LUN vs per target.

The current requirements specify that the protocol must support multiple connections per session. So far the only justification for this that I have clearly heard is performance: current and future systems will demand bandwidth that will require aggregation. Is there any other reason for multiple connections?

My challenge to this requirement is that it is fundamentally a link and transport layer issue that is being exposed to the session layer due to a perception that current link/transport implementations are not adequate to meet perceived demand. The key question here is whether this is a "physics" issue that can't be solved with better implementations, or just bad implementations. I am leaning towards the latter. I expect that if this protocol is a success, a number of highly tuned adapters using tricks such as hardware assist will be developed. Those doing the development will have direct control over the quality of the implementation. Furthermore, the performance-critical environments are likely to be local in nature, so pressure to create the necessary switches and routers will also exist.

The advantage of limiting a session to a single connection should be a simplification in connection management and error handling. From the earliest drafts we have already seen restrictions of individual command/data/status sequences to a single connection to better handle ordering issues. I foresee further restrictions possibly being required to cover handling of lost connections when sequences are received out of order across multiple connections. Similarly, Steve's comments on security management of multiple connections are of concern.

The second area that I brought up was the requirement of one session per initiator/target pair instead of one per LUN (i.e. SEP). I am willing to accept the design constraint that a single target must address 10,000 LUNs, which can be done with a connection per LUN. However, scaling much higher, into the areas where 64K port limitations appear, is I think not reasonable. Given that the bandwidth available on today's and near-future drives will easily exceed 100 MBps, I can't imagine designing and deploying storage systems with over 10,000 LUNs but only one network adapter.
Even with 10+ Gbps networks this will be a horrible throughput bottleneck that will get worse, as storage adapters appear to be gaining bandwidth faster than networks. Therefore requiring more than 10,000 doesn't seem necessary.

From the performance perspective, a connection per LUN also makes sense. SCSI command flows are already being constrained to a single connection in the current proposal for ordering reasons, so the number of concurrent outstanding requests per LUN is a manageable number. The concurrency desired from multiple connections per session in the existing draft will naturally occur with a connection per LUN. As each TCP connection is a unique flow, existing link layer hardware that tries to preserve ordering based on a "flow" (likely IP/port pairs) will give the desired performance properties. Both my objections and the requirements for multiple connections I question above become moot.

From a connection management, command ordering, and error recovery perspective, things should also get simpler. Ordering is obviously maintained, and the sender can now recover from connection errors based on a smaller context and possibly use TCP layer information to determine what responses were received (ACK windows?).

To summarize, I would like to see the requirements changed to reflect a maximum of 64K LUNs per IP node, require only one transport layer connection per session, and define a session to be an initiator/LUN pair.

-David
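The bottleneck argument in David's message can be put in numbers with a short calculation. This is a rough sketch rather than anything taken from the draft: the function names are hypothetical, the inputs are the figures quoted in the message (10,000 LUNs, 100 MBps per drive, a 10 Gbps network), and it assumes each LUN is backed by one drive running at its full rate. The port check additionally assumes one TCP connection per LUN between a single initiator address and a single target address and port, so that each connection needs a distinct 16-bit source port.

```python
TCP_PORT_SPACE = 2 ** 16  # a TCP port number is a 16-bit field


def per_lun_share_mbit(link_gbit_per_s, lun_count):
    """Bandwidth each LUN gets if one link is shared evenly, in Mbit/s."""
    return link_gbit_per_s * 1000.0 / lun_count


def oversubscription(link_gbit_per_s, lun_count, drive_mbyte_per_s):
    """Ratio of aggregate drive bandwidth to the bandwidth of the one link."""
    demand_gbit_per_s = lun_count * drive_mbyte_per_s * 8 / 1000.0
    return demand_gbit_per_s / link_gbit_per_s


def fits_port_space(lun_count):
    """True if a connection per LUN fits in one 16-bit port range (assumption above)."""
    return lun_count <= TCP_PORT_SPACE


share = per_lun_share_mbit(link_gbit_per_s=10, lun_count=10_000)
ratio = oversubscription(link_gbit_per_s=10, lun_count=10_000,
                         drive_mbyte_per_s=100)

print(f"{share} Mbit/s per LUN, link oversubscribed {ratio:.0f}x, "
      f"fits in port space: {fits_port_space(10_000)}")
```

With these inputs each LUN is left with about 1 Mbit/s of a link that the drives could oversubscribe several hundred times over, while 10,000 connections still fit comfortably within the 64K port range.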