RE: Keep-alive traffic (was iSCSI: more on StatRN)

Cheng,

It is not as simple as that. StatRN does not provide a deterministic
timeout for a connection failure indication, so it should not be relied
upon to detect a failed connection. Even with keepalive, the TCP timeout
will still be too long for useful connection recovery should status be
pending. It would be desirable to prevent a reset and overlapping
recoveries, so for connection failure detection to be useful, it must be
relatively quick. The SCSI layer should know little about the underlying
transport, so the transport must be proactive in responding to transport
failure. Repetitive probes every 10 seconds could satisfy detection
requirements while status is pending. This would also give the target
early notice of a client failure, since communication while not idle
would be deterministic.

Doug

> StatRN and keep-alive are intended for detecting and recovering a lost
> connection or iSCSI command. My opinion is they are mandatory only if
> the Internet is a very unreliable connection. In traditional SCSI
> adapters, a target device never initiates a recovery but must detect
> duplicated commands. An initiator device always tries its best to
> detect an error early and reissue the command without resorting to the
> big hammer. Application software and device driver timeouts are
> imperative because a SCSI device can die without warning. StatRN and
> keep-alive are needed when the frequency of losing a connection or
> command is so high that recovery by timeout is considered undesirable
> and inefficient. If the error frequency is low, then keep the design
> simple and stupid by letting the device driver timeout do its job. When
> a target device is shared by many initiators, killing it with a big
> hammer involves everyone sharing the target.
> Those who debated forever on hard and soft SCSI resets understand the
> need for, and the consequences of, the big hammer. For an iSCSI
> adapter, if the frequency of connection and command loss is high,
> StatRN and keep-alive are useful in helping detect the loss early.
>
> To know why a target never initiates a recovery but must detect
> duplicated commands, and why an initiator must detect an error early
> without resorting to the big hammer, please make a table of all
> possible errors and their recovery actions; the conclusion is then
> obvious.
>
> Y.P. Cheng, CTO, ConnectCom Solutions Corp.
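
To make the 10-second probe suggestion in the reply concrete, here is a minimal sketch of a transport-level failure detector. It is only an illustration under stated assumptions: the class name ProbeMonitor, the max_missed threshold, and the injectable clock are hypothetical and do not come from the iSCSI draft or from either message. It assumes a probe is sent every interval seconds and that any reply resets the timer.

```python
import time


class ProbeMonitor:
    """Hypothetical helper: declares a connection failed after missed probes.

    Assumes one probe is sent every `interval` seconds and that a healthy
    peer answers each probe promptly.
    """

    def __init__(self, interval=10.0, max_missed=2, clock=time.monotonic):
        self.interval = interval      # seconds between probes (10 s in the message)
        self.max_missed = max_missed  # assumed threshold, not from the source
        self.clock = clock            # injectable so the logic can be tested
        self.last_reply = clock()

    def on_reply(self):
        """Record that the peer answered a probe (or sent any traffic)."""
        self.last_reply = self.clock()

    def missed(self):
        """Number of whole probe intervals elapsed since the last reply."""
        return int((self.clock() - self.last_reply) // self.interval)

    def failed(self):
        """True once at least `max_missed` intervals passed without a reply."""
        return self.missed() >= self.max_missed

    def detection_bound(self):
        """Silence, in seconds, after which failure is declared."""
        return self.interval * self.max_missed
```

With these assumed values the detector reports a failure after 20 seconds of silence, a bound that is fixed by configuration rather than by TCP retransmission timers. That is the deterministic property the reply asks for; the specific threshold is a design choice, not something either author specified.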