06/28 Draft with change bars

I have included the Perl script I used to generate the change bars.
The draft was generated as follows:

   perl create-change-bar.pl iSCSI000628.txt iscsi00615.txt > iSCSI000628.2.txt

Hope it helps!

-Costa

Internet-Draft                          J. Satran
<draft-satran-iSCSI-03.txt>             D. Smith
Expires December 28, 2000               K. Meth
                                        IBM
                                        C. Sapuntzakis
                                        Cisco Systems
                                        Randy Haagens
                                        Hewlett-Packard Co.
                                        Efri Zeidner
                                        SANGate
                                        Paul Von Stamwitz
                                        Adaptec
                                        Luciano Dalle Ore
                                        Quantum
                                        June 28, 2000

                      iSCSI (Internet SCSI)

Status of this Memo

This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

0.0.1. Acknowledgements

A large group of people contributed through their review, comments
and valuable insights to the creation of this document - too many to
mention them all. Nevertheless, we are grateful to all of them.
We are especially grateful to those who found the time and patience
to participate in our weekly phone conferences and intermediate
meetings in Almaden and Haifa and thus helped shape this document:
Matt Wakeley (Agilent), Jim Hafner, John Hufferd, Prasenjit Sarkar,
Meir Toledano, John Dowdy, Steve Legg, Alain Azagury (IBM), Dave
Nagle (CMU), David Black (EMC), John Matze (Veritas), Mark Bakke,
Steve DeGroote, Mark Shrandt (NuSpeed), Gabi Hecht (Gadzoox), Robert
Snively (Brocade), Nelson Nachum (StorAge).

Satran, Smith, Sapuntzakis, Meth [Page 1] iSCSI June 2000

Table of Contents

   1.  Abstract
   2.  Overview
       2.1.  SCSI Concepts
       2.2.  iSCSI Concepts & Functional Overview
       2.3.  iSCSI Login
       2.4.  iSCSI Full Feature Phase
       2.5.  iSCSI Connection Termination
       2.6.  Naming
   3.  Message Formats
       3.1.  Template Header
       3.2.  SCSI Command
       3.3.  SCSI Response
       3.4.  Asynchronous Event
       3.5.  SCSI Task Management Message
       3.6.  SCSI Task Management Response
       3.7.  Ready To Transfer (RTT)
       3.8.  SCSI Data
       3.9.  Text Command
       3.10. Text Response
       3.11. Login Command
       3.12. Login Response
       3.13. Ping Command
       3.14. Ping Response
       3.15. Third Party Commands
       3.16. Opcode Not Understood
   4.  Error Handling iSCSI
   5.  Notes to Implementors
       5.1.  Small TCP Segments
       5.2.  Multiple Network Adapters
       5.3.  Autosense
       5.4.  TCP RDMA option
       5.5.  Data Connections Options
   6.  Security Considerations
       6.1.  Data Integrity
       6.2.  Login Process
       6.3.  IANA Considerations
   7.  Authors' Addresses
   8.  References and Bibliography
   9.  Appendix A - Examples
       9.1.  Read operation example
       9.2.  Write operation example
   10. Appendix B - Login/Text keys

1. Abstract

The Small Computer Systems Interface (SCSI) is a popular family of
protocols for communicating with I/O devices, especially storage
devices. This memo describes a transport protocol for SCSI that
operates on top of TCP.
The iSCSI protocol aims to be fully compliant with the requirements
laid out in the SCSI Architecture Model - 2 [SAM2] document.

2. Overview

2.1. SCSI Concepts

The endpoint of most SCSI commands is a "logical unit" (LUN).
Examples of logical units include hard drives, tape drives, CD and
DVD drives, printers and processors. Within the logical unit, the
abstract entity that executes the SCSI commands is named the
device-server.

A "target" is a collection of logical units, in general of the same
kind, and is directly addressable on the network. In large
installations a target is also known as a "control unit". The target
corresponds to the server in the abstract SAM client-server model.

An "initiator" creates and sends SCSI commands to the target. The
initiator corresponds to the client in the abstract SAM client-server
model.

A "task" is a linked set of SCSI commands. Some LUNs support multiple
pending (queued) tasks. The target uses a "task tag" to distinguish
between tasks. Only one command in a task can be outstanding at any
given time.

A SCSI command results in an optional data phase and a response
phase. In the data phase, information travels either from the
initiator to the target, as in a WRITE command, or from target to
initiator, as in a READ command. In the response phase, the target
returns the final status of the operation, including any errors. A
response terminates a SCSI command.

2.2. iSCSI Concepts & Functional Overview

The following conceptual layering model is used in this document to
specify initiator and target actions and how those relate to
transmitted and received Protocol Data Units:

   - the SCSI layer builds/receives SCSI CDBs (Command Descriptor
     Blocks) and relays/receives them with the remaining command
     execute parameters (cf. SAM-2) to/from the

   - iSCSI layer, which builds/receives iSCSI PDUs and
     relays/receives them to/from

   - one or more TCP connections that form an initiator-target
     "session"

Communication between initiator and target occurs over one or more
TCP connections. The TCP connections are used for sending control
messages, SCSI commands, parameters and data within iSCSI protocol
data units (iSCSI PDUs). The group of TCP connections linking an
initiator with a target forms a session (loosely equivalent to a SCSI
nexus); a session is defined by a session ID (composed of an
initiator part and a target part). TCP connections can be added to
and removed from a session.

iSCSI supports ordered command delivery within a session and limited
command and data recovery. All SCSI commands presented to iSCSI get a
"command reference number", and this number can be used by a
receiving target for ordered delivery. A sliding window mechanism is
used to limit the number of outstanding commands. For descriptive
purposes it is assumed that the iSCSI layer implements the sliding
window mechanism.

2.3. iSCSI Login

The purpose of iSCSI login is to enable a TCP connection for iSCSI
use, authenticate the parties, authorize the initiator to send SCSI
commands and mark the connection as belonging to an iSCSI session. A
session is used to identify to a target all the connections with a
given initiator.

The targets listen on a well-known TCP port for incoming connections.
The initiator begins the login process by connecting to that
well-known TCP port.

As part of the login process, the initiator and target MAY wish to
authenticate each other. This can occur in many different ways. For
example, the endpoints may wish to check the IP address of the other
party. If the TCP connection uses transport layer security [TLS],
certificates may be used to identify the endpoints.
Also, iSCSI includes commands for identifying the initiator and
passing an authenticator to the target (see Appendix B).

Once suitable authentication has occurred, the target MAY authorize
the initiator to send SCSI commands. How the target chooses to
authorize an initiator is beyond the scope of this document. The
target indicates a successful authentication and authorization by
sending a login response with "accept login".

The login message includes a session ID, composed of an initiator
part (ISID) and a target part (TSID). For a new session the TSID is
null. As part of the response the target will generate a TSID.
Session-specific parameters (e.g., the maximum number of connections
that can be used for this session) can be specified only for the
first login of a session (TSID null). Connection-specific parameters
(if any) can be specified for any login. Thus a session is
operational once it has at least one connection, and a pending login
cannot affect a whole session.

After authentication and authorization, other parameters may be
negotiated using the highly extensible Text Command message, which
allows arbitrary key:value pairs to be passed.

Any message sent on a TCP connection before this connection gets into
full feature phase at the initiator should be rejected by the
initiator. A message reaching a target on a TCP connection before the
full feature phase will be rejected with an iSCSI check condition.

2.4. iSCSI Full Feature Phase

Once the initiator is authorized to do so, the iSCSI session is in
iSCSI full feature phase. The initiator may send SCSI commands and
data to the various LUNs on the target by mapping them in iSCSI
messages that go over the established iSCSI session.
For SCSI commands that require data and/or parameter transfer, the
(optional) data and the status for a command must be sent over the
same TCP connection that was used to deliver the SCSI command
(connection allegiance). Thus if an initiator issues a READ command,
the target must send the requested data followed by the status to the
initiator over the same TCP connection that was used to deliver the
SCSI command. If an initiator issues a WRITE command, the initiator
must send the data for that command and the target must return the
status over the same TCP connection that was used to deliver the SCSI
command.

During iSCSI Full Feature Phase, the initiator and target may
interleave unrelated SCSI commands, their SCSI Data and responses,
over the session.

Outgoing SCSI data (initiator to target - user data or command
parameters) is sent as either unsolicited data or solicited data.
Unsolicited data can be part of an iSCSI command PDU ("immediate
data") or an iSCSI data PDU. Solicited data is sent in response to
Ready To Transfer PDUs.

Targets operate in either solicited (RTT) data mode or unsolicited
(non-RTT) data mode. An initiator must always honor an RTT data
request. It is considered an error for an initiator to send
unsolicited data PDUs to a target operating in RTT mode (only
solicited data). By default, immediate data is limited to 64 Kbytes
and an initiator is allowed to send immediate data (subject to
limitations specified elsewhere in this document) even to targets
working in RTT mode. An initiator may request, at login, to send
immediate data of any size, and a target may indicate the size of
immediate data blocks it is ready to accept in its response.

A target is allowed to silently discard data and request
retransmission through RTT. Initiators will not perform any
scoreboarding for data, and the residual count calculation is to be
performed by the targets. Incoming data is always solicited.
However, an initiator will be able to request retransmission of all
or part of the target data.

SCSI Data packets are matched to their corresponding SCSI commands by
using Tags that are specified in the protocol. Initiator tags for
pending commands are unique initiator-wide for a session. Target tags
for pending commands are unique target-wide for the session.

Although the above mechanisms are designed to accomplish efficient
data delivery and a large degree of control over the data flow, it is
recognized that some specific sequences involving ordered execution
and a mix of solicited and immediate data can result in deadlocks. It
is for this reason that discarding data by a target is considered a
legitimate action. Examples of such sequences are presented in
appendix C together with recovery scenarios.

Outgoing commands are numbered by iSCSI (CmdRN), and response PDUs
(target to initiator) will continuously update the initiator about
the maximum command number that can be sent (sliding window).

Each iSCSI session to a target is treated as if it originated from a
different initiator.

2.5. iSCSI Connection Termination

Connection termination is assumed to be an exceptional event.
Graceful TCP connection shutdowns are done by sending TCP FINs.
Graceful connection shutdowns MUST only occur when there are no
outstanding tasks that have allegiance to the connection. A target
SHOULD respond rapidly to a FIN from the initiator by closing its
half of the connection as soon as it has finished all outstanding
tasks that have allegiance to the connection. Closing a connection
that has outstanding tasks may require recovery actions and will be
described elsewhere in this document.

2.6. Naming

Targets are named using a URL-type name of the format:

   scsi://<domain-name>[/modifier]

The name used to connect will be optionally included in the login in
order to enable the target to present different views. This is the
Target Acquired Name (TAN). We will not attempt to define which
components of the name will participate in the name resolution
process and which ones will be used only for "view" definition. The
syntactic sugar included might be used to introduce structure for
management purposes but has no specific significance for this
standard.

Example:

   scsi://diskfarm1.acme.com
   scsi://computingcenter.acme.com/peripherals/diskfarm1

When a target has to act as an initiator for a third party command,
it will use the TAN during login as required by the authentication
mechanism.

A domain name that contains exactly four numbers separated by dots
(.), where each number is in the range 0 through 255, will be
interpreted as an IPv4 address.

Examples:

   10.0.0.1/tapefarm1
   10.0.0.2

3. Message Formats

All multi-byte integers specified in formats defined in this document
are to be represented in network byte order (i.e., big endian).

3.1. Template Header and Opcodes

All iSCSI messages and responses have a header of the same length (40
bytes). Additional data may be added, as necessary, beginning with
byte 40. The fields of Opcode and Length appear in all message and
response headers. The other most commonly used fields are Initiator
Task Tag, Logical Unit Number, and Flags, which, when used, always
appear in the same location of the header.
Byte /    0        |       1       |       2       |       3       |
    /              |               |               |               |
   |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
   +---------------+---------------+---------------+---------------+
  0| Opcode        | Opcode-specific fields                        |
   +---------------+---------------+---------------+---------------+
  4| Length of Data (after 40 byte Header)                         |
   +---------------+---------------+---------------+---------------+
  8| LUN or Opcode-specific fields                                 |
   +                                                               +
 12|                                                               |
   +---------------+---------------+---------------+---------------+
 16| Initiator Task Tag                                            |
   +---------------+---------------+---------------+---------------+
 20/ Opcode-specific fields                                        /
  +/                                                               /
   +---------------+---------------+---------------+---------------+
 40

3.1.1. Opcode

The Opcode indicates which iSCSI type of message or response is
encapsulated by the header.

Valid opcodes for messages (sent by initiator to target) are:

   0x00 Ping Command (from initiator to target)
   0x01 SCSI Command (encapsulates a SCSI Command Descriptor Block)
   0x02 SCSI Task Management Message
   0x03 Login Command
   0x04 Text Command
   0x05 SCSI Data (for WRITE operation)

Valid opcodes for responses (sent by target to initiator) are:

   0x80 Ping Response (from target to initiator)
   0x81 SCSI Response (contains SCSI status and possibly sense
        information or other response information)
   0x82 SCSI Task Management Response
   0x83 Login Response
   0x84 Text Response
   0x85 SCSI Data (for READ operation)
   0x86 Ready To Transfer (RTT - sent by target to initiator when it
        is ready to receive data from initiator)
   0x87 Asynchronous Event (sent by target to initiator to indicate
        certain special conditions)
   0x88 Opcode Not Understood
   0x89 Open Data Connections Response (optional)

3.1.2. Length

The Length field indicates the number of bytes, beyond the first 40
bytes, that are being sent together with this message header.
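
As an illustration of the template header layout, the following
Python sketch decodes the fields that sit at a fixed position in the
40-byte header (Opcode at byte 0, Length at bytes 4-7, the LUN area
at bytes 8-15, Initiator Task Tag at bytes 16-19). The names
TemplateHeader and parse_template_header are hypothetical and not
part of this draft; the offsets and opcode values are taken from the
diagram and lists in Section 3.1.

```python
import struct
from typing import NamedTuple

HEADER_LEN = 40  # fixed header size stated in Section 3.1

# Opcode values copied from the lists in Section 3.1.1.
OPCODE_NAMES = {
    0x00: "Ping Command",
    0x01: "SCSI Command",
    0x02: "SCSI Task Management Message",
    0x03: "Login Command",
    0x04: "Text Command",
    0x05: "SCSI Data (WRITE)",
    0x80: "Ping Response",
    0x81: "SCSI Response",
    0x82: "SCSI Task Management Response",
    0x83: "Login Response",
    0x84: "Text Response",
    0x85: "SCSI Data (READ)",
    0x86: "Ready To Transfer",
    0x87: "Asynchronous Event",
    0x88: "Opcode Not Understood",
    0x89: "Open Data Connections Response",
}


class TemplateHeader(NamedTuple):  # hypothetical container, not from the draft
    opcode: int
    length: int              # bytes that follow the 40-byte header
    lun: bytes               # raw bytes 8-15; opcode-specific in some PDUs
    initiator_task_tag: int  # only meaningful for PDUs that carry a tag


def parse_template_header(buf: bytes) -> TemplateHeader:
    """Decode the fields that share a fixed position in every header."""
    if len(buf) < HEADER_LEN:
        raise ValueError("need at least 40 bytes for an iSCSI header")
    opcode = buf[0]
    # ">I" is a big-endian (network byte order) unsigned 32-bit integer.
    (length,) = struct.unpack_from(">I", buf, 4)
    lun = bytes(buf[8:16])
    (initiator_task_tag,) = struct.unpack_from(">I", buf, 16)
    return TemplateHeader(opcode, length, lun, initiator_task_tag)
```

A receiver would typically read 40 bytes, call
parse_template_header, and then read a further `length` bytes before
dispatching on the opcode.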
It is anticipated that most iSCSI messages and responses (not
counting data transfer messages) will not need more than the 40 byte
header, and hence the Length field will contain the value 0. CDBs
larger than 16 bytes and parameter data are expected to follow the
header.

3.1.3. LUN

The LUN specifies the Logical Unit for which the command is targeted.
If the command does not relate to a Logical Unit, this field is
either ignored or may be used for some other purpose. According to
[SAM2], a Logical Unit Number can take up to a 64-bit field that
identifies the Logical Unit within a target device. The exact format
of this field can be found in the [SAM2] document.

3.1.4. Initiator Task Tag

The initiator assigns a Task Id (or tag) to each SCSI task that it
issues. (Recall that a task is a linked set of SCSI commands.) This
Tag is an initiator-wide unique identifier that can be used to
uniquely identify the Task.

3.1.5. Opcode-specific fields

These fields have different meanings for different messages.

3.2. SCSI Command

Byte /    0        |       1       |       2       |       3       |
    /              |               |               |               |
   |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
   +---------------+---------------+---------------+---------------+
  0| Opcode (0x01) |I|R|A|Rsv|ATTR | CmdRN                         |
   +---------------+---------------+---------------+---------------+
  4| Length                                                        |
   +---------------+---------------+---------------+---------------+
  8| Logical Unit Number (LUN)                                     |
   +                                                               +
 12|                                                               |
   +---------------+---------------+---------------+---------------+
 16| Initiator Task Tag                                            |
   +---------------+---------------+---------------+---------------+
 20| Expected Data Transfer Length                                 |
   +---------------+---------------+---------------+---------------+
 24| SCSI Command Descriptor Block (CDB)                           |
   +                                                               +
 28|                                                               |
   +                                                               +
 32|                                                               |
   +                                                               +
 36|                                                               |
   +---------------+---------------+---------------+---------------+
 40/ Additional Data (Command Dependent)                           /
  +/                                                               /
   +---------------+---------------+---------------+---------------+

3.2.1. Flags

The Flags field for a SCSI Command consists of one byte.

   Byte 2
   b0    (I) Immediate Data from initiator to target
         (write/control).
   b1    (R) set when data is expected to flow from target to
         initiator (read).
   b2    (A) set to turn off Autosense for this command (see [SAM2]).
   b3-4  Reserved (should be 0)
   b5-7  used to indicate Task Attributes.

Autosense refers to the automatic return of sense data to the
initiator in case a command did not complete successfully. If
autosense is turned off, the initiator must explicitly request that
sense data be sent to it after some command has completed with a
CHECK CONDITION status.

3.2.2. Task Attributes

The Task Attribute field (ATTR) can have one of the following integer
values (see [SAM2] for details):

   0 Untagged
   1 Simple
   2 Ordered
   3 Head of Queue
   4 ACA

3.2.3. Command Reference Number (CmdRN)

The Command Reference Number (CmdRN) is provided by the initiator to
assist in performing ordered delivery for iSCSI commands.
CmdRN reflects the value of a 16-bit counter maintained by the
initiator and incremented by 1 for every command received by the
iSCSI delivery mechanism. The counter is set to an initial value at
session initiation (default is 0) and is reset to 0 when a target
reset is sent.

3.2.4. Expected Data Transfer Length

The Expected Data Transfer Length field states the number of bytes of
data that the initiator expects will be sent for this (READ or WRITE)
SCSI operation in SCSI Data packets. For a WRITE operation, the
initiator uses this field to specify the number of bytes of data it
expects to transfer for this operation (not counting data headers).
For a READ operation, the initiator uses this field to specify the
number of bytes of data it expects the target to transfer to the
initiator (not counting data headers). If no data will be transferred
in SCSI Data packets for this SCSI operation, this field should be
set to 0.

Upon completion of a data transfer, the target will inform the
initiator of how many bytes were actually processed (sent or
received) by the target.

3.2.5. SCSI Command Descriptor Block (CDB)

There are 16 bytes in the CDB field, designed to accommodate the
largest currently defined CDB. If, in the future, larger CDBs are
allowed, the spill-over of the CDB may extend beyond the 40-byte
header.

3.2.6. Command-Data

Some SCSI commands require additional parameter data to accompany the
SCSI command. This data may be placed beyond the 40-byte boundary of
the iSCSI header. Alternatively, user data can be placed in the same
PDU (in both cases we talk about immediate data). The Length field is
set to the length of this data beyond the 40-byte header (i.e., it
includes the CDB extension if present). The CDB length is:

   Length + 16 - I*ExpectedDataTransferLength

3.3. SCSI Response

Byte /    0        |       1       |       2       |       3       |
    /              |               |               |               |
   |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
   +---------------+---------------+---------------+---------------+
  0| Opcode (0x81) | Rsvd(0)   |O|U| MaxCmdRN                      |
   +---------------+---------------+---------------+---------------+
  4| Length                                                        |
   +---------------+---------------+---------------+---------------+
  8| Reserved (0)                                                  |
   +                                                               +
 12|                                                               |
   +---------------+---------------+---------------+---------------+
 16| Initiator Task Tag                                            |
   +---------------+---------------+---------------+---------------+
 20| Residual Count                                                |
   +---------------+---------------+---------------+---------------+
 24| Command Status| iSCSI Status  | StatRN                        |
   +---------------+---------------+---------------+---------------+
 28/ Reserved (0)                                                  /
  +/                                                               /
   +---------------+---------------+---------------+---------------+
 40/ Response or Sense Data (optional)                             /
  +/                                                               /
   +---------------+---------------+---------------+---------------+

3.3.1. Flags

The SCSI Response has its own set of flags, which differs from the
flags for a SCSI Command.

   Byte 2
   b0    (U) set for Residual Underflow. In this case, the Residual
         Count indicates how many bytes were not transferred out of
         those expected to be transferred.
   b1    (O) set for Residual Overflow. In this case, the Residual
         Count indicates how many bytes could not be transferred
         because the initiator's Expected Data Transfer Length was
         too small.
   b2-7  not used (should be set to 0).

Bits 0 and 1 are mutually exclusive.

3.3.2. MaxCmdRN

MaxCmdRN indicates the maximum CmdRN the initiator should send. The
initiator uses it to set an internal limit register and will refrain
from sending commands numbered past MaxCmdRN (considering also
wrap-around). Attention should be paid to the fact that response PDUs
can arrive in the "wrong order". The internal limit register can only
be advanced by incoming responses (considering also wrap-around).
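
The wrap-around handling is not spelled out in this section. One
possible reading, sketched below in Python, treats CmdRN and MaxCmdRN
as 16-bit serial numbers and compares them modulo 2^16. The half-range
comparison rule (a value counts as "ahead" if it is fewer than 2^15
steps forward) is an assumption borrowed from common serial-number
arithmetic, not something this draft states, and it only works if the
window never exceeds 2^15 commands. The class and function names are
hypothetical.

```python
MOD = 1 << 16    # CmdRN is described as a 16-bit counter
HALF = 1 << 15   # assumed half-range rule, not taken from the draft


def serial_le(a: int, b: int) -> bool:
    """True if a is at or before b in 16-bit serial-number order."""
    return ((b - a) % MOD) < HALF


class CommandWindow:
    """Initiator-side limit register driven by MaxCmdRN (hypothetical)."""

    def __init__(self, max_cmd_rn: int = 0) -> None:
        self.max_cmd_rn = max_cmd_rn % MOD

    def may_send(self, cmd_rn: int) -> bool:
        # Refrain from sending commands numbered past MaxCmdRN.
        return serial_le(cmd_rn % MOD, self.max_cmd_rn)

    def on_response(self, max_cmd_rn: int) -> None:
        # Responses can arrive out of order, so only move forward.
        max_cmd_rn %= MOD
        if serial_le(self.max_cmd_rn, max_cmd_rn):
            self.max_cmd_rn = max_cmd_rn
```

With this rule, a late response carrying an older MaxCmdRN is ignored
rather than shrinking the window, which matches the statement that
the limit register can only be advanced.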
It is assumed that any target will accept fewer than 64k outstanding
commands.

3.3.3. Residual Count

The Residual Count field is valid only in case either the Residual
Underflow bit or Residual Overflow bit is set. If neither bit is set,
the Residual Count field should be 0. If the Residual Underflow bit
is set, the Residual Count indicates how many bytes were not
transferred out of those expected to be transferred. If the Residual
Overflow bit is set, the Residual Count indicates how many bytes
could not be transferred because the initiator's Expected Data
Transfer Length was too small.

3.3.4. Command Status

The Command Status field is used to report the SCSI status of the
command (as specified in [SAM2]).

3.3.5. iSCSI Status

The iSCSI Status field is used to report the status of the command
before it was sent by the target to the LUN. The values are given
below.

   0 Good status
   1 iSCSI check

If the iSCSI Status field is not 0, the command status will indicate
CHECK CONDITION.

3.3.6. Response or Sense Data

If Autosense was not disabled in the originating CDB and the Command
Status was CHECK CONDITION (0x02), then the Response Data field will
contain sense data for the failed command. Some sense codes will
relate to iSCSI check conditions (e.g., excessive number of
outstanding commands, immediate data blocks too large, etc.). If the
Command Status is Good (0x00), then the Response Data field will
contain data from the data phase of the CDB. The Length parameter
specifies the number of bytes in this field. If no error occurred,
and no data is needed for the response to the SCSI Command, the
Length field is 0. Note that if the Command Status was CHECK
CONDITION but Autosense was disabled, then sense data must be
explicitly requested by the initiator with a new SCSI command.

3.3.7. StatRN - Status Reference Number

StatRN is a reference number that the target iSCSI layer generates
whenever it issues a response by incrementing an internal counter. A
gap in StatRN indicates a lost status (possibly due to connection
failure); the status can be recovered by reissuing the outstanding
command with the original TaskID and CmdRN.

3.4. Asynchronous Event

An Asynchronous Event may be sent from the target to the initiator
without corresponding to a particular command. The target specifies
the status for the event and sense data.

Byte /    0        |       1       |       2       |       3       |
    /              |               |               |               |
   |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
   +---------------+---------------+---------------+---------------+
  0| Opcode (0x87) | Reserved (0)  | MaxCmdRN                      |
   +---------------+---------------+---------------+---------------+
  4| Length                                                        |
   +---------------+---------------+---------------+---------------+
  8| Logical Unit Number (LUN)                                     |
   +                                                               +
 12|                                                               |
   +---------------+---------------+---------------+---------------+
 16| Reserved (0)                                                  |
   +                                                               +
 20|                                                               |
   +---------------+---------------+---------------+---------------+
 24| Command Status| iSCSI Status  | Reserved (0)                  |
   +---------------+---------------+---------------+---------------+
 28|Event Indicator| Reserved (0)                                  |
   +---------------+---------------+---------------+---------------+
 32| Reserved (0)                                                  |
   +                                                               +
 36|                                                               |
   +---------------+---------------+---------------+---------------+
 40/ Sense Data                                                    /
  +/                                                               /
   +---------------+---------------+---------------+---------------+

3.4.1. iSCSI Status

Some Asynchronous Events are strictly related to iSCSI while others
are related to SAM-2. The codes returned for iSCSI Asynchronous
Events are:

   2 Target is being reset.

3.4.2. Event Indicator

The following values are defined. (See [SAM2] for details.)

   1 An error condition was encountered after command completion.
   2 A newly initialized device is available.
   3 Some other type of unit attention condition has occurred.
   4 An asynchronous event has occurred.

Sense Data accompanying the report identifies the condition. The
Length parameter is set to the length of the Sense Data.

3.4.3. MaxCmdRN - informs other initiators about this value after a
target reset

3.5. SCSI Task Management Message

Byte /    0        |       1       |       2       |       3       |
    /              |               |               |               |
   |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
   +---------------+---------------+---------------+---------------+
  0| Opcode (0x02) | Msg indicator | Reserved (0)                  |
   +---------------+---------------+---------------+---------------+
  4| Length                                                        |
   +---------------+---------------+---------------+---------------+
  8| Logical Unit Number (LUN)                                     |
   +                                                               +
 12|                                                               |
   +---------------+---------------+---------------+---------------+
 16| Initiator Task Tag                                            |
   +---------------+---------------+---------------+---------------+
 20/ Reserved (0)                                                  /
  +/                                                               /
   +---------------+---------------+---------------+---------------+
 40

3.5.1. Msg Indicator

The Task Management functions provide an initiator with a way to
explicitly control the execution of one or more Tasks. The Task
Management functions are summarized as follows (for a more detailed
description see the [SAM2] document):

   1 Abort Task - aborts the task identified by the Task Tag field.
   2 Abort Task Set - aborts all Tasks issued by this initiator on
     the Logical Unit.
   3 Clear ACA - clears the Auto Contingent Allegiance condition.
   4 Clear Task Set - aborts all Tasks (from all initiators) for the
     Logical Unit.
   5 Logical Unit Reset.
   6 Target Reset.

For the functions above except <Target Reset>, a SCSI Task Management
Response is returned, using the Initiator Task Tag to identify the
operation for which it is responding.

For the <Target Reset> function, the target cancels all pending
operations.
The target may send an Asynchronous Event to all attached initiators
notifying them that the target is being reset. The target then closes
all of its TCP connections to all initiators (all sessions are
terminated).

3.6. SCSI Task Management Response

Byte /    0        |       1       |       2       |       3       |
    /              |               |               |               |
   |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
   +---------------+---------------+---------------+---------------+
  0| Opcode (0x82) | Msg indicator | Reserved (0)                  |
   +---------------+---------------+---------------+---------------+
  4| Length                                                        |
   +---------------+---------------+---------------+---------------+
  8| Logical Unit Number (LUN)                                     |
   +                                                               +
 12|                                                               |
   +---------------+---------------+---------------+---------------+
 16| Initiator Task Tag                                            |
   +---------------+---------------+---------------+---------------+
 20| Reserved (0)                                                  |
   +---------------+---------------+---------------+---------------+
 24| Response      | Reserved (0)                                  |
   +---------------+---------------+---------------+---------------+
 28/ Reserved (0)                                                  /
  +/                                                               /
   +---------------+---------------+---------------+---------------+
 40

For the functions <Abort Task, Abort Task Set, Clear ACA, Clear Task
Set, Logical Unit Reset>, the target performs the requested Task
Management function and sends a SCSI Task Management Response back to
the initiator. The target includes all of the information the
initiator provided in the SCSI Task Management Message, so the
initiator can know exactly which SCSI Task Management Message was
serviced. In addition, the target provides a Response indication
which may take on the following values:

   0 Function Complete
   1 Function Rejected

For the <Target Reset> function, the target cancels all pending
operations. The target may send an Asynchronous Event to all attached
initiators notifying them that the target is being reset. The target
then closes all of its TCP connections to all initiators (terminates
all sessions).

3.6.1. MaxCmdRN - maximum CmdRN the target will accept

3.7. Ready To Transfer (RTT)

When an initiator has submitted a SCSI Command with data passing from
the initiator to the target (WRITE), the target may specify which
blocks of data it is ready to receive. In general, the target may
request that the data blocks be delivered in whatever order is
convenient for the target at that particular instant. This
information is passed from the target to the initiator in the Ready
To Transfer (RTT) message.

In order to allow write operations without RTT, the initiator and
target must have agreed to do so by both sending the AllowNoRTT:yes
key:value pair to each other (either during Login or through the Text
Command/Response mechanism).

Byte /    0        |       1       |       2       |       3       |
    /              |               |               |               |
   |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
   +---------------+---------------+---------------+---------------+
  0| Opcode (0x86) | Reserved (0)  | MaxCmdRN                      |
   +---------------+---------------+---------------+---------------+
  4| Length                                                        |
   +---------------+---------------+---------------+---------------+
  8| Reserved (0)                                                  |
   +                                                               +
 12|                                                               |
   +---------------+---------------+---------------+---------------+
 16| Initiator Task Tag                                            |
   +---------------+---------------+---------------+---------------+
 20| Desired Data Transfer Length                                  |
   +---------------+---------------+---------------+---------------+
 24| Data Offset                                                   |
   +---------------+---------------+---------------+---------------+
 28| Target Transfer Tag                                           |
   +---------------+---------------+---------------+---------------+
 32| Reserved (0)                                                  |
   +                                                               +
 36|                                                               |
   +---------------+---------------+---------------+---------------+
 40

3.7.1. MaxCmdRN - maximum CmdRN the target will accept

3.7.2. Desired Data Transfer Length and Data Offset

The target specifies how many bytes it wants the initiator to send as
a result of this RTT message.
   The target may request the data from the initiator in several
   chunks, not necessarily in the original order of the data. The
   target, therefore, also specifies a Data Offset indicating the
   point at which the data transfer should begin, relative to the
   beginning of the total data transfer.

|3.7.3. Target Transfer Tag

   The target assigns its own tag to each RTT request that it sends to
   the initiator. This can be used by the target to easily identify
   data it receives, and can also be used as an RDMA tag [RDMA].

3.8. SCSI Data

   The typical data transfer specifies the length of the data payload,
   the Transfer Tag provided by the receiver for this data transfer,
|  and a buffer offset. The typical SCSI Data packet for WRITE (from
|  initiator to target) has the following format:

Byte /      0       |       1       |       2       |       3       |
   /                |               |               |               |
    |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
    +---------------+---------------+---------------+---------------+
   0| Opcode (0x05) | Reserved (0)                                  |
    +---------------+---------------+---------------+---------------+
   4| Length                                                        |
    +---------------+---------------+---------------+---------------+
|  8| Transfer Tag                                                  |
    +---------------+---------------+---------------+---------------+
  12| Data Offset                                                   |
    +---------------+---------------+---------------+---------------+
  16| Initiator Task Tag                                            |
    +---------------+---------------+---------------+---------------+
  20/ Reserved (0)                                                  /
   +/                                                               /
    +---------------+---------------+---------------+---------------+
  40/ Payload                                                       /
   +/                                                               /
    +---------------+---------------+---------------+---------------+

   The typical SCSI Data packet for READ (from target to initiator)
   has the following format:

Byte /      0       |       1       |       2       |       3       |
   /                |               |               |               |
    |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
    +---------------+---------------+---------------+---------------+
|  0| Opcode (0x85) |   (0)   |S|O|U| MaxCmdRN or (0)               |
    +---------------+---------------+---------------+---------------+
   4| Length                                                        |
    +---------------+---------------+---------------+---------------+
|  8| Transfer Tag                                                  |
    +---------------+---------------+---------------+---------------+
  12| Data Offset                                                   |
    +---------------+---------------+---------------+---------------+
  16| Initiator Task Tag                                            |
    +---------------+---------------+---------------+---------------+
  20| Residual Count                                                |
    +---------------+---------------+---------------+---------------+
| 24| Command Status| iSCSI Status  | StatRN                        |
    +---------------+---------------+---------------+---------------+
  28/ Reserved (0)                                                  /
   +/                                                               /
    +---------------+---------------+---------------+---------------+
  40/ Payload                                                       /
   +/                                                               /
    +---------------+---------------+---------------+---------------+

3.8.1. Length

   The length field specifies the total number of bytes in the
   following payload.

3.8.2. Transfer Tag

   The Transfer Tag identifies the operation to which this data
|  transfer belongs. When the transfer is from the target to the
|  initiator, the Transfer Tag is the Initiator Task Tag that was sent
|  with the SCSI command. When the transfer is from the initiator to
|  the target, the Transfer Tag is the Target Transfer Tag when RTT is
|  enabled, or the Initiator Task Tag when RTT is disabled.

3.8.3. Buffer Offset

   The Buffer Offset field contains the offset of the following data
   against the complete data transfer. The sum of the buffer offset
   and length should not exceed the expected transfer length for the
   command.

3.8.4. Flags

   The last SCSI Data packet sent from a target to an initiator for a
   particular SCSI command that completed successfully may optionally
   also contain the Command Status for the data transfer. In this case
   Sense Data cannot be sent together with the Command Status.
   If the command completed with an error, then the response and sense
   data must be sent in a SCSI Response packet and must not be sent in
   a SCSI Data packet.

      Byte 2   b0-1  as in an ordinary SCSI Response
               b2    (S) set to indicate that the Command Status
                     field contains status.
               b3-7  not used (should be set to 0).

   If the (S) bit is set, the extra fields in
|  the SCSI Data packet (MaxCmdRN, Command Status, Residual Count,
|  StatRN) are meaningful.

3.9. Text Command

   The Text Command is provided to allow the exchange of information
   and for future extensions. It permits the initiator to inform a
   target of its capabilities or to request some special operations.

Byte /      0       |       1       |       2       |       3       |
   /                |               |               |               |
    |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
    +---------------+---------------+---------------+---------------+
   0| Opcode (0x04) | Reserved (0)                                  |
    +---------------+---------------+---------------+---------------+
   4| Length                                                        |
    +---------------+---------------+---------------+---------------+
   8/ Reserved (0)                                                  /
   +/                                                               /
    +---------------+---------------+---------------+---------------+
  16| Initiator Task Tag                                            |
    +---------------+---------------+---------------+---------------+
  20/ Reserved (0)                                                  /
   +/                                                               /
    +---------------+---------------+---------------+---------------+
  40/ Text                                                          /
   +/                                                               /
    +---------------+---------------+---------------+---------------+

3.9.1. Length

   The length, in bytes, of the Text field.

3.9.2. Initiator Task Tag

   The initiator assigned identifier for this Text Command.

3.9.3. Text

   The initiator sends the target a set of key:value pairs in UTF-8
|  Unicode format. The key and value are separated by a ':' (0x3A)
   delimiter. Many key:value pairs can be included in the Text block
|  by separating them with null (0x00) delimiters. Some basic
|  key:value pairs are described in Appendix B. The target responds
|  by sending its response back to the initiator.
|  The target and initiator can then perform some advanced operations
|  based on their common capabilities. Manufacturers may introduce new
|  keys by prefixing them with their (reversed) domain name, for
|  example,

      com.foo.bar.do_something:0000000000000003

   Any key that the target does not understand may be ignored without
   affecting basic function. Once the target has processed all the
   key:value pairs, it responds with the Text Response command,
   listing the parameters that it supports. It is recommended that
   Text operations that will take a long time be placed in their own
|  Text command. If the Text Response does not contain a key that was
|  requested, the initiator must assume that the key was not
|  understood by the target.

3.10. Text Response

   The Text Response message contains the responses of the target to
   the initiator's Text Command. The format of the Text field matches
   that of the Text Command.

Byte /      0       |       1       |       2       |       3       |
   /                |               |               |               |
    |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
    +---------------+---------------+---------------+---------------+
   0| Opcode (0x84) | Reserved (0)                                  |
    +---------------+---------------+---------------+---------------+
   4| Length                                                        |
    +---------------+---------------+---------------+---------------+
   8/ Reserved (0)                                                  /
   +/                                                               /
    +---------------+---------------+---------------+---------------+
  16| Initiator Task Tag                                            |
    +---------------+---------------+---------------+---------------+
  20/ Reserved (0)                                                  /
   +/                                                               /
    +---------------+---------------+---------------+---------------+
  40/ Text Response                                                 /
   +/                                                               /
    +---------------+---------------+---------------+---------------+

3.10.1. Length

   The length, in bytes, of the Text Response field.

3.10.2. Initiator Task Tag

   The Initiator Task Tag matches the tag used in the initial Text
   Command and is used by the initiator to relate the Text Commands
   with the appropriate Text Responses.

3.10.3. Text Response

   The Text Response field contains responses in the same key:value
   format as the Text Command. Appendix B lists some basic Text
|  Commands and their Responses. If the Text Response does not contain
|  a key that was requested, the initiator must assume that the key
|  was not understood by the target.

3.11. Login Command

   After establishing a TCP connection between an initiator and a
   target, the initiator should issue a Login Command to gain further
   access to the target's resources.

Byte /      0       |       1       |       2       |       3       |
   /                |               |               |               |
    |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
    +---------------+---------------+---------------+---------------+
|  0| Opcode (0x03) | Reserved (0)  | CmdRN or Reserved (0)         |
    +---------------+---------------+---------------+---------------+
   4| Length                                                        |
    +---------------+---------------+---------------+---------------+
|  8| CID                           | RecoverCID or 0               |
|   +---------------+---------------+---------------+---------------+
| 12| Reserved (0)                                                  |
    +---------------+---------------+---------------+---------------+
  16| ISID                          | TSID                          |
    +---------------+---------------+---------------+---------------+
  20/ Reserved (0)                                                  /
   +/                                                               /
    +---------------+---------------+---------------+---------------+
  40/ Login Parameters in Text Command Format                       /
   +/                                                               /
    +---------------+---------------+---------------+---------------+

3.11.1. CID

   A unique ID for this connection within the session.

3.11.2. RecoverCID

   For a connection used to recover a lost TCP connection, the
   initiator provides the CID of the failed connection in the
   RecoverCID field; otherwise the field is 0. A simple target may
   reject recovery. In this case the initiator will terminate all
   outstanding commands with a CHECK CONDITION status and reset the
   target.
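
As a rough illustration of the Login Command layout above, the following Python sketch packs such a message. It is not part of the draft. The field widths (16 bits each for CmdRN, CID, RecoverCID, ISID and TSID) are read off the diagram; network byte order, the reading of Length as the byte count of the parameter text, and the helper names encode_text and build_login_command are assumptions made for this example.

```python
import struct

LOGIN_OPCODE = 0x03
HEADER_LEN = 40


def encode_text(pairs):
    """Encode key:value pairs as UTF-8, separated by NUL (0x00) delimiters."""
    return b"\x00".join(f"{key}:{value}".encode("utf-8") for key, value in pairs)


def build_login_command(cid, isid, tsid=0, cmd_rn=0, recover_cid=0, params=()):
    """Pack a Login Command: a 40-byte header followed by the parameter text.

    Assumed layout (big-endian): opcode(1) reserved(1) CmdRN(2) Length(4)
    CID(2) RecoverCID(2) reserved(4) ISID(2) TSID(2) reserved(20).
    """
    text = encode_text(params)
    header = struct.pack(
        "!BBHIHHIHH20x",
        LOGIN_OPCODE, 0, cmd_rn, len(text),
        cid, recover_cid, 0, isid, tsid,
    )
    assert len(header) == HEADER_LEN
    return header + text


pdu = build_login_command(
    cid=1,
    isid=7,
    params=[
        ("Initiator", "sample.foobar.org"),
        ("Target", "disk-array.sj-bldg-h.cisco.com"),
    ],
)
```

A leading connection would pass tsid=0 and a starting cmd_rn; a recovery login would pass the failed connection's CID as recover_cid.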
   The initiator may provide some basic parameters in order to enable
   the target to determine if the initiator may in fact use the
|  target's resources. The format of the parameters is as specified
|  for the Text Command. Targets may require keys to indicate the
|  Domain Name of the initiator and the target, and perhaps also an
|  Authenticator key. The initiator may also provide additional
|  parameters to the target in Text Command format, if the initiator
|  so desires. Keys and their explanations are listed in Appendix B.

   Whenever desired an initiator will identify its view of the target
   as in:

|     Target:<domain-name>[/modifier][:port]

   implying that the target is known as:

      scsi://<domain-name>[/modifier]

|  and it should be connected through port "port" (the default well
|  known port has an IANA defined value of xx). Initiators can use the
|  same type of naming, implying a machine and optionally a principal
|  (e.g. operating system image), as in:

|     Initiator:<domain-name>[/modifier]

|  implying that the initiator is known as:

|     iSCSI://<domain-name>[/modifier]

   Thus the parameters passed for a plain-text password authentication
   are:

      Initiator:<domain-name>[/modifier]
|     Target:<domain-name>[/modifier]
      Authenticator:open-sesame

   The modifier iSCSI-SYS is reserved for administrative functions.

   ISID and TSID collectively form the SSID (session id). A TSID of 0
   indicates a leading connection. Only a leading connection login can
|  carry session specific parameters, e.g. max-connections-requested,
|  the maximum immediate data length requested, etc. CmdRN is
|  significant only if TSID is 0 and indicates the starting Command
|  reference number for this session; it should be 0 for all other
|  instances.

3.12. Login Response

   The target responds to the Login Command with a Login Response.
   It is sufficient for the target to respond with a Status indicating
|  that the Login is accepted. In fact, the target may completely
|  ignore the parameters that were sent to it and may provide service
|  to any initiator that connects to it. The target may also return
|  parameters using the format of the Text Response opcode, if it so
|  desires. In particular, the target may want to provide its
|  Authenticator key, so that the initiator can be sure that it is in
|  fact talking with the correct target. The initiator can request that
|  the target provide the Authenticator parameter by specifying the
|  SendAuthenticator:yes key:value pair.

Byte /      0       |       1       |       2       |       3       |
   /                |               |               |               |
    |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
    +---------------+---------------+---------------+---------------+
|  0| Opcode (0x83) | Reserved (0)  | MaxCmdRN or Reserved (0)      |
    +---------------+---------------+---------------+---------------+
   4| Length                                                        |
    +---------------+---------------+---------------+---------------+
   8/ Reserved (0)                                                  /
   +/                                                               /
    +---------------+---------------+---------------+---------------+
  16| ISID                          | TSID                          |
    +---------------+---------------+---------------+---------------+
  20| Reserved (0)                                                  |
    +---------------+---------------+---------------+---------------+
  24| Status        | Reserved (0)                                  |
    +---------------+---------------+---------------+---------------+
  28/ Reserved (0)                                                  /
   +/                                                               /
    +---------------+---------------+---------------+---------------+
  40/ Login Parameters in Text Command Format                       /
   +/                                                               /
    +---------------+---------------+---------------+---------------+

   The format of the Login Response is the same as the Text Response,
   with the addition of one field.

3.12.1. Status

   The Status returned in a Login Response is one of the following:

      0 accept login (will now accept SCSI commands)
      1 reject login
      2 additional authentication required
      3 reject recovery

   In the case that the Status is "accept login" the initiator may
   proceed to issue SCSI commands. In the case that the Status is
   "reject login" the initiator should immediately close down its end
   of the TCP connection, thus freeing up the target's port for some
   other connection. The target also has the option of immediately
|  closing down its end of the TCP connection. In the case that the
|  Status is "additional authentication required" the initiator must
|  provide additional authentication information by issuing the Text
|  Command with the appropriate key:value pairs. (This may be
|  required if the authentication method is based on a
   challenge/response algorithm.) Upon receipt of the necessary
   authentication, the target will issue a Login Response with the
   "accept login" Status. SCSI Commands will not be accepted until the
   target provides a Login Response with the "accept login"
|  Status. The TSID is an initiator identifying tag set by the
|  target. A 0 in the returned TSID indicates that either the target
|  supports only a single connection or that the ISID has already been
|  used as a leading ISID. In both cases the target is rejecting the
|  login. MaxCmdRN indicates the maximum CmdRN the initiator should
|  send. When reaching this number (considering number wrap-around)
|  the initiator should refrain from sending further commands until
|  the initiator receives a new MaxCmdRN that has advanced past the
|  old value.

3.13. Ping Command

Byte /      0       |       1       |       2       |       3       |
   /                |               |               |               |
    |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
    +---------------+---------------+---------------+---------------+
|  0| Opcode (0x00) | Reserved (0)  | MaxStatRN                     |
    +---------------+---------------+---------------+---------------+
   4| Length                                                        |
    +---------------+---------------+---------------+---------------+
   8/ Reserved (0)                                                  /
   +/                                                               /
    +---------------+---------------+---------------+---------------+
  16| Initiator Task Tag                                            |
    +---------------+---------------+---------------+---------------+
  20/ Reserved (0)                                                  /
   +/                                                               /
    +---------------+---------------+---------------+---------------+
  40/ Ping Data (optional)                                          /
   +/                                                               /
    +---------------+---------------+---------------+---------------+

   The Ping Command can be used to verify that a connection is still
|  active. It may be useful in the case where an initiator has been
|  waiting a long time for the response to some command, and the
|  initiator suspects that there is some problem with the connection.
   When a target receives the Ping Command, it should respond with a
   Ping Response, duplicating as much of the data as possible that was
|  provided in the Ping Command (if such data was present). If the
|  initiator does not receive the Ping Response within some period of
|  time (determined by the initiator), or if the data returned by the
|  Ping Response is different from the data that was in the Ping
|  Command, the initiator may conclude that there is a problem with
|  the connection. The initiator will then close the connection and
|  may try to establish a new connection.

|3.13.1. MaxStatRN - the next StatRN expected

|3.13.2. Length

   The length of the optional Ping Data.

|3.13.3. Initiator Task Tag

   An initiator assigned identifier for the operation.

|3.13.4. Ping Data

   Binary data that will be reflected in the Ping Response.

3.14. Ping Response

Byte /      0       |       1       |       2       |       3       |
   /                |               |               |               |
    |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
    +---------------+---------------+---------------+---------------+
|  0| Opcode (0x80) | Reserved (0)  | MaxCmdRN                      |
    +---------------+---------------+---------------+---------------+
   4| Length                                                        |
    +---------------+---------------+---------------+---------------+
   8/ Reserved (0)                                                  /
   +/                                                               /
    +---------------+---------------+---------------+---------------+
  16| Initiator Task Tag                                            |
    +---------------+---------------+---------------+---------------+
  20/ Reserved (0)                                                  /
   +/                                                               /
    +---------------+---------------+---------------+---------------+
  40/ Return Ping Data                                              /
   +/                                                               /
    +---------------+---------------+---------------+---------------+

   When a target receives the Ping Command, it should respond with a
   Ping Response, duplicating the data and Initiator Task Tag that
   were provided in the Ping Command, if present.

|3.15. Third Party Commands -INCOMPLETE!

   There are some third-party SCSI commands, such as COPY and EXTENDED
   COPY, that require one target (Target A) to act as an initiator to
   other targets (e.g., Target B). Some such commands can be extended
   in a straightforward way to accommodate new forms of addressing,
   and this should be done to address targets using iSCSI. These
   extensions are not straightforward for all commands, and they may
   not be able to encompass the full name space and authentication
   information needed for iSCSI in some contexts. Thus iSCSI also
   provides a facility for assigning local short-form aliases to full
   addressing/authorization information for targets, and the aliases
   can be used in the SCSI commands and parameter data. The alias
   information is specified as Text following the header of the SCSI
   command specifying the third-party command.
   The header will thus appear as follows:

Byte /      0       |       1       |       2       |       3       |
   /                |               |               |               |
    |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
    +---------------+---------------+---------------+---------------+
|  0| Opcode (0x01) |1|0|A|Rsv|ATTR | CmdRN                         |
    +---------------+---------------+---------------+---------------+
   4| Length (!= 0)                                                 |
    +---------------+---------------+---------------+---------------+
   8| Logical Unit Number (LUN)                                     |
    +                                                               +
  12|                                                               |
    +---------------+---------------+---------------+---------------+
  16| Initiator Task Tag                                            |
    +---------------+---------------+---------------+---------------+
  20| Expected Data Transfer Length                                 |
    +---------------+---------------+---------------+---------------+
  24| SCSI Command Descriptor Block (CDB)                           |
    +                                                               +
  28|                                                               |
    +                                                               +
  32|                                                               |
    +                                                               +
  36|                                                               |
    +---------------+---------------+---------------+---------------+
  40/ Extended CDB if any                                           /
   +/ Parameters needed for Target B                                /
    /                                                               /
    +---------------+---------------+---------------+---------------+

   The Length field will not be zero. Rather, it will contain the
   length of the extended CDB, if any, and the length of the alias
   information, which may include the name of Target B and an
   Authentication key in Text Command format. An example of the data
   for this command might be:

      LocalName:TargetB
|     FullName:sj.foo.com/controller1
      OriginalAuthenticator:open-sesame

   Upon receiving a third-party command, Target A will perform login
   operations with the identified targets. In effect, Target A will
   become an initiator to Target B. Among the parameters provided to
   Target B, Target A may specify the authentication information from
   the initiator. The Text provided by Target A when it performs the
   Login command to Target B may contain the keys Target (referring to
   Target B) and Initiator (referring to Target A), and it may also
   contain the keys Authenticator (of Target A), OriginalInitiator and
   OriginalAuthenticator (referring to the authenticator of the
   original initiator).
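
The proxy login described above can be sketched in Python as follows. This is an illustration only, not something the draft specifies: the functions parse_text and proxy_login_params, the name target-a.foo.com, and the decision to omit Target A's own Authenticator when none is configured are hypothetical. The NUL-delimited key:value text format is the one described for the Text Command.

```python
def parse_text(data):
    """Split NUL-delimited key:value text into a dict (split on the first ':')."""
    pairs = {}
    for item in data.split(b"\x00"):
        if not item:
            continue
        key, _, value = item.decode("utf-8").partition(":")
        pairs[key] = value
    return pairs


def proxy_login_params(alias, target_a, target_a_auth, original_initiator):
    """Build the login keys Target A might send when logging in to Target B.

    alias is the parsed alias text from the third-party command. The key
    names are those listed in the text; which of them a target actually
    requires is left open by the draft.
    """
    params = [("Target", alias["FullName"]), ("Initiator", target_a)]
    if target_a_auth is not None:
        params.append(("Authenticator", target_a_auth))
    params.append(("OriginalInitiator", original_initiator))
    if "OriginalAuthenticator" in alias:
        params.append(("OriginalAuthenticator", alias["OriginalAuthenticator"]))
    return params


alias_text = (
    b"LocalName:TargetB\x00"
    b"FullName:sj.foo.com/controller1\x00"
    b"OriginalAuthenticator:open-sesame"
)
alias = parse_text(alias_text)
login_params = proxy_login_params(
    alias, "target-a.foo.com", None, "sample.foobar.org"
)
```

Splitting on the first ':' only keeps values that themselves contain a colon, such as a name with a port suffix, intact.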
3.16. Opcode Not Understood

Byte /      0       |       1       |       2       |       3       |
   /                |               |               |               |
    |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
    +---------------+---------------+---------------+---------------+
   0| Opcode (0x88) | Reserved (0)                                  |
    +---------------+---------------+---------------+---------------+
   4| Length                                                        |
    +---------------+---------------+---------------+---------------+
   8/ Reserved (0)                                                  /
   +/                                                               /
    +---------------+---------------+---------------+---------------+
  40/ Header of Bad Message                                         /
   +/                                                               /
    +---------------+---------------+---------------+---------------+
  80

   It may happen that a target receives a message with an Opcode that
   it doesn't recognize. This may occur because of a new version of
   the protocol that defines a new Opcode, or because of some
|  corruption of a message header. The target returns the header of
|  the message with the unrecognized opcode as the data of the
|  response.

|4. iSCSI Error Handling

|4.1. Communications Errors

|  For any outstanding SCSI command it is assumed that iSCSI, in
|  conjunction with SCSI at the initiator, is able to keep enough
|  information to rebuild the command PDU, that outgoing data is
|  available (in host memory) for retransmission while the command is
|  outstanding, and that at the target iSCSI and specialized TCP
|  implementations are able to recover unacknowledged data packets
|  from a closing connection or, alternatively, the target has means
|  to re-read data from a device-server. It is further assumed that a
|  target will keep the "status & sense" for commands it has executed
|  while the total number of outstanding commands and executed
|  commands does not exceed its limit. A target will sequentially
|  number the delivered responses and thus enable initiators to tell
|  when a response is missing and which response is missing.
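
The last assumption, sequentially numbered responses, is what lets an initiator notice a lost response. A minimal Python sketch of such a check follows. The class name StatRNTracker, the 16-bit width of StatRN (inferred from the SCSI Data layout), and the half-range rule used to tell stale numbers from new ones are assumptions made for illustration, not requirements of the draft.

```python
STATRN_MOD = 1 << 16  # assumption: StatRN is a 16-bit counter that wraps


class StatRNTracker:
    """Track the next expected StatRN and report numbers that were skipped."""

    def __init__(self, first_expected=0):
        self.expected = first_expected % STATRN_MOD

    def receive(self, stat_rn):
        """Return the StatRN values missing before stat_rn (empty if none)."""
        gap = (stat_rn - self.expected) % STATRN_MOD
        if gap >= STATRN_MOD // 2:
            # Looks like an old or duplicate response; leave state unchanged.
            return []
        missing = [(self.expected + i) % STATRN_MOD for i in range(gap)]
        self.expected = (stat_rn + 1) % STATRN_MOD
        return missing


tracker = StatRNTracker(first_expected=65534)
in_order = tracker.receive(65534)
across_wrap = tracker.receive(1)
duplicate = tracker.receive(1)
```

A non-empty result would be the trigger for the recovery steps, since it names exactly which responses the initiator has not seen.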
|  Under those conditions iSCSI will be able to keep a session in
|  operation provided that it is able to keep or establish at
|  least one TCP connection between the initiator and target in a
|  timely fashion. Unfortunately the maximum admissible recovery time
|  is a function of the target and for some devices and communications
|  networks recovery may be complex and may percolate to upper
|  software layers. It is assumed that targets and/or initiators will
|  recognize a failing connection by either transport level means
|  (TCP) or by a gap in the command stream that does not get filled
|  for a long time, or by failing iSCSI ping (the latter should be
|  used periodically by highly reliable implementations). The recovery
|  involves the following steps:

|  - abort offending TCP connection(s) (target & initiator) and
|    recover at target all unacknowledged read-data
|  - create one or more new TCP connections (within the same session)
|    and associate all outstanding commands from the failed
|    connection(s) to the FIRST new connection at both initiator
|    and target
|  - the initiator will reissue all outstanding commands with their
|    original CmdRN and TaskID
|  - upon receiving the new/restarting commands the target will
|    resume command execution; for write commands it means
|    requesting data retransmission through RTT, for reads
|    retransmitting recovered data and for "terminated" commands
|    retransmitting the status & sense while retaining the original
|    StatRN. If data recovery is not possible the target will either
|    provide data from the media or redo the operation (if the
|    operation is not idempotent the device server may fail the
|    operation).

|4.2. Protocol Errors

|  The authors recognize that mapping framed messages over a "stream"
|  connection (like TCP) makes the proposed mechanisms vulnerable to
|  simple software framing errors, and introducing framing mechanisms
|  may be onerous for performance and bandwidth. Command reference
|  numbers and the above mechanisms for connection drop and
|  reestablishment will help handle this type of mapping error.

|4.3. Session Errors

|  If all the connections of a session fail and can't be reestablished
|  in a short time, or if initiators detect protocol errors
|  repeatedly, an initiator may choose to terminate a session and
|  establish a new session (indicating old session termination?). It
|  will terminate all outstanding requests with an iSCSI error
|  indication before initiating a new session. A target that detects
|  one of the above errors will take the following actions:

|  - Reset the TCP connections (close the session).
   - Abort all Tasks in the task set for the corresponding
     initiator.

5. Notes to Implementers

|  This section notes some of the performance and reliability
|  considerations of the iSCSI protocol. This protocol was designed to
|  take advantage of a generic Remote DMA TCP option [RDMA],
|  although it can still operate effectively without this TCP
|  extension.

5.1. Small TCP Segments

   It is recommended that TCP segments be limited in size to no more
   than 8K bytes. One reason is to ensure that segments won't get
   broken into smaller packets, thereby possibly breaking the
   assumptions for RDMA and the information in the RDMA header.
   Another reason we recommend small segments is to allow a stronger
   type of checksum, possibly utilizing CRC, which is practical only
   for smaller segments.

5.2. Multiple Network Adapters

   The iSCSI protocol assumes that the Task Tags will also serve as
   RDMA tags.
   The iSCSI protocol allows multiple connections, not all of which
   need go over the same network adapter. If multiple network
   connections are to be utilized with RDMA, the allegiance of a
   command, its data, and its status to one TCP connection ensures
   that there is no need to replicate information across network
   adapters or otherwise require them to cooperate.

5.3. Autosense

   Autosense refers to the automatic return of sense data to the
   initiator in case a command did not complete successfully. If
   autosense is turned off, the initiator must explicitly request that
   sense data be sent to it after some command has completed with a
   CHECK CONDITION status. The default for iSCSI is to work with
|  Autosense enabled. Note that even if a SCSI target/LUN does not
|  support Autosense, it may still be possible for iSCSI to work with
|  Autosense. This can be accomplished as follows. Whenever a CHECK
|  CONDITION status is about to be returned, the iSCSI component on
|  the target immediately queries the target/LUN for the sense data.
|  iSCSI can then return the sense data to the initiator together with
|  the CHECK CONDITION status. It is not necessary for iSCSI to wait
|  for the initiator to explicitly request the sense data; the target
|  iSCSI code can perform this operation automatically, even for
|  devices/LUNs that do not ordinarily provide automatic sense data.

5.4. TCP RDMA option

   The TCP RDMA option [RDMA] is an annotation on individual TCP
   segments that can reduce the number of copies necessary at the
   receiver. The RDMA option succinctly describes the portion of a TCP
   payload that holds bulk data.

5.5. TCP Connection Options

   Some targets may want to inform (or negotiate with) an initiator
   concerning some parameters related to bandwidth, Quality of
   Service, or some other available features on its various network
   connections. These are exchanged between the initiator and the
   target using Text Commands and Responses.
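
A minimal Python sketch of such an exchange, using the Text Command and Text Response layouts, might look like the following. Network byte order and the helper names are assumptions for this example. The key com.foo.bar.do_something is the vendor-key example from the Text Command section, not a defined connection option; the draft does not list any keys for bandwidth or Quality of Service.

```python
import struct

TEXT_COMMAND = 0x04
TEXT_RESPONSE = 0x84
# opcode(1) reserved(3) Length(4) reserved(8) Initiator Task Tag(4) reserved(20)
HEADER = struct.Struct("!B3xI8xI20x")


def build_text_command(task_tag, pairs):
    """Pack a Text Command carrying NUL-delimited key:value pairs."""
    text = b"\x00".join(f"{k}:{v}".encode("utf-8") for k, v in pairs)
    return HEADER.pack(TEXT_COMMAND, len(text), task_tag) + text


def parse_text_response(pdu):
    """Return (task_tag, pairs) from a Text Response."""
    opcode, length, task_tag = HEADER.unpack(pdu[:HEADER.size])
    if opcode != TEXT_RESPONSE:
        raise ValueError(f"not a Text Response: opcode 0x{opcode:02x}")
    pairs = {}
    for item in pdu[HEADER.size:HEADER.size + length].split(b"\x00"):
        if item:
            key, _, value = item.decode("utf-8").partition(":")
            pairs[key] = value
    return task_tag, pairs


def keys_not_understood(requested, answered):
    """Keys absent from the response must be treated as not understood."""
    return [key for key in requested if key not in answered]


requested = [
    ("AllowNoRTT", "yes"),
    ("com.foo.bar.do_something", "0000000000000003"),
]
command = build_text_command(5, requested)

# A hand-built response standing in for a target that knows only AllowNoRTT.
reply_text = b"AllowNoRTT:yes"
reply = HEADER.pack(TEXT_RESPONSE, len(reply_text), 5) + reply_text
tag, answered = parse_text_response(reply)
ignored = keys_not_understood([k for k, _ in requested], answered)
```

The initiator matches the reply to its command by the Initiator Task Tag and falls back to default behavior for every key listed in ignored.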
6. Security Considerations

6.1. Data Integrity

   We assume that end-to-end data integrity can be assured by TCP, by
   adding a more powerful checksum option whenever this is considered
   important, or replacing the checksum by a weaker one (or even
   "nullifying it") for applications in which data integrity is not
   important and recovery from data errors could be harmful (e.g.,
   audio or video distribution streams).

6.2. Login Process

   In some environments, a target will not be interested in
   authenticating the initiator. In this case, the target can simply
   ignore some or all of the parameters sent in a Login Command, and
   the target can simply reply with a basic Login Response indicating
|  a successful login. Some targets may want to perform some kind of
|  authentication. The Authenticator key is defined for this purpose.
|  Various authentication schemes can be used, including encrypted
|  passwords and trusted certificate authorities. Once the initiator
|  and target are confident of the identity of the attached party, the
|  established channel is considered secure. It is anticipated that
|  most target devices will not bother with all of the possible
|  checks, but the protocol provides sufficient means to perform the
|  checks, if required by the target.

6.3. IANA Considerations

   There will be a well known port for iSCSI connections. This well
   known port will have to be registered with IANA. A checksum type
   will also have to be registered with IANA.

7. Authors' Addresses

   Julian Satran
   Kalman Meth
   IBM, Haifa Research Lab
   MATAM - Advanced Technology Center
   Haifa 31905, Israel
   Phone +972 4 829 6211
   Email: Julian_Satran@vnet.ibm.com
          meth@il.ibm.com

   Daniel F. Smith
   IBM Almaden Research Center
   650 Harry Road
   San Jose, CA 95120-6099, USA
   Phone: +1 408 927 2072
   Email: dfsmith@almaden.ibm.com

   Costa Sapuntzakis
   Cisco Systems, Inc.
   170 W. Tasman Drive
   San Jose, CA 95134, USA
   Phone: +1 408 525 5497
   Email: csapuntz@cisco.com

   Efri Zeidner
   SANGate
   Israel
   efri@sangate.com

   Comments may be sent to Julian Satran, Daniel Smith, Costa
   Sapuntzakis, or Kalman Meth.

8. References and Bibliography

   [RDMA]    Internet Draft: TCP RDMA option (work in progress)
   [SAM2]    ANSI X3.270-1998, SCSI-3 Architecture Model (SAM-2)
   [TLS]     The TLS Protocol, RFC 2246, T. Dierks et al.
   [ALTC]    Internet Draft: Alternative checksums (work in progress)
   [CAM]     ANSI X3.232-199X, Common Access Method-3 (CAM-3)
   [CRC]     ISO 3309, High-Level Data Link Control (CRC 32)
   [RFC793]  Transmission Control Protocol, RFC 793
|  [RFC1122] Requirements for Internet Hosts-Communication Layer, RFC
|            1122, R. Braden (editor)
   [SBC]     ANSI X3.306-199X, SCSI-3 Block Commands (SBC)
   [SCSI2]   ANSI X3.131-1994, SCSI-2
   [SPC]     ANSI X3.301-199X, SCSI-3 Primary Commands (SPC)

9. Appendix A - Examples

9.1. Read operation example

 +------------------+-----------------------+----------------------+
 |Initiator Function|    Message Type       |  Target Function     |
 +------------------+-----------------------+----------------------+
 | Command request  |SCSI Command (READ)>>> |                      |
 |     (read)       |                       |                      |
 +------------------+-----------------------+----------------------+
 |                  |                       | Prepare Data Transfer|
 +------------------+-----------------------+----------------------+
 | Receive Data     |   <<< SCSI Data       |   Send Data          |
 +------------------+-----------------------+----------------------+
 | Receive Data     |   <<< SCSI Data       |   Send Data          |
 +------------------+-----------------------+----------------------+
 | Receive Data     |   <<< SCSI Data       |   Send Data          |
 +------------------+-----------------------+----------------------+
 |                  |   <<< SCSI Response   |Send Status and Sense |
 +------------------+-----------------------+----------------------+
 | Command Complete |                       |                      |
 +------------------+-----------------------+----------------------+

9.2. Write operation example

 +------------------+-----------------------+---------------------+
 |Initiator Function|    Message Type       |  Target Function    |
 +------------------+-----------------------+---------------------+
 | Command request  |SCSI Command (WRITE)>>>| Receive command     |
 |     (write)      |                       | and queue it        |
 +------------------+-----------------------+---------------------+
 |                  |                       | Process old commands|
 +------------------+-----------------------+---------------------+
 |                  |                       | Ready to process    |
 |                  |   <<< RTT             | WRITE command       |
 +------------------+-----------------------+---------------------+
 | Send Data        |   SCSI Data >>>       | Receive Data        |
 +------------------+-----------------------+---------------------+
 | Send Data        |   SCSI Data >>>       | Receive Data        |
 +------------------+-----------------------+---------------------+
 |                  |   <<< RTT             |                     |
 +------------------+-----------------------+---------------------+
 | Send Data        |   SCSI Data >>>       | Receive Data        |
 +------------------+-----------------------+---------------------+
 |                  |   <<< SCSI Response   |Send Status and Sense|
 +------------------+-----------------------+---------------------+
 | Command Complete |                       |                     |
 +------------------+-----------------------+---------------------+

10. Appendix B - Login/Text keys

10.1. Target

      Target:domainname[/modifier]

   Examples:

      Target:disk-array.sj-bldg-h.cisco.com
      Target:disk-array.sj-bldg-h.cisco.com/disk3

   This key is provided by the initiator of the TCP connection to the
   remote endpoint. The Target key specifies the domain name of the
   target, since that information is not available from the TCP layer.
|  The target is not required to support this key. The initiator
|  should send this key in the first login message. The Target key
|  might be used by the target to learn the intended initiator view of
|  the target.

10.2. Initiator

|     Initiator:[domainname[/modifier]]

   Examples:

      Initiator:sample.foobar.org
      Initiator:cluster.foobar.org/machine1
      Initiator:

   The Initiator key enables the initiator to identify itself to the
   remote endpoint. The domain name should be that of the initiator.
   A zero-length domain name is interpreted as "other side of TCP
   connection". The target may silently ignore this key if it does not
|  support it. For more security, a certificate-based protocol [TLS]
|  may be used on the channel and take precedence over this protocol.

10.3. Authenticator

|     Authenticator:<UTF8-String>

   Examples:

      Authenticator:open-sesame

   The authenticator is a secret that the initiator uses to gain
   access to the target's LUNs.

10.4. SendAuthenticator

|     SendAuthenticator:yes

|  Response:

|     Authenticator:<UTF8-String>

|  Examples:

|     SendAuthenticator:yes -> Authenticator:alakazam

   The SendAuthenticator key is used to request that the party on the
   other side of the TCP connection send its Authenticator.
   iSCSI devices may refuse to grant access until proper authentication
   has been performed by the parties involved.

 10.5. AllowNoRTT

|  AllowNoRTT:<yes|no>
|  Response: AllowNoRTT:<yes|no>

   Examples:

      AllowNoRTT:yes -> AllowNoRTT:yes

   The AllowNoRTT key is used to allow an initiator to send data to a
   target without the target having sent an RTT to the initiator. The
   default action is that RTT is required, unless both the initiator
   and the target send this key-pair attribute specifying
   AllowNoRTT:yes. Once AllowNoRTT has been set to 'yes', it cannot be
   set back to 'no'.

 10.6. OriginalInitiator

|  OriginalInitiator:[domainname[/modifier]]

   Examples:

      OriginalInitiator:sample.foobar.org

   The OriginalInitiator key is used to perform a proxy login from one
   target to another target in order to perform a third-party operation
   (like COPY) for some initiator. The first target acts as the
   initiator for the second target, but it must provide the
   authorization information of the original initiator.

 10.7. Target2

|  Target2:domainname[/modifier]

   Examples:

      Target2:sample.foobar.org
      Target2:sample.foobar.org/raid2

   The Target2 key is used in a third-party SCSI command (like COPY)
   between targets that do not lie on the same SCSI fabric. The
   initiator must specify the name of the distant target to the
   original target, so that the original target can log in to the
   distant target and then perform the third-party command.

 Expires 15 December 2000

#!
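use strict;

# Usage: perl create-change-bar.pl <new-draft> <old-draft> > <output>
my $src = $ARGV[0];    # new version; change bars are added to this file
my $dst = $ARGV[1];    # old version to compare against

open(SRC, $src) or die "cannot open $src: $!";
$src .= ".tmp";
open(SRC2, ">$src") or die "cannot create $src: $!";
open(DST, $dst) or die "cannot open $dst: $!";
$dst .= ".tmp";
open(DST2, ">$dst") or die "cannot create $dst: $!";

#
# Remove unnecessary lines from each and store mapping
#
my @linemap = ();      # filtered line index -> original line number in SRC
my @changes = ();      # original line number -> 1 if the line changed
my $oldlineno = -1;
my $newlineno = 0;
my $state = 0;         # 1 = previous line was a form feed; skip this line too

while (<SRC>) {
    $oldlineno++;
    $changes[$oldlineno] = 0;
    if ($state == 1) { $state = 0; next; }
    if (/\014/) { $state = 1; next; }
    next if (/\[Page/);
    next if (/^\s*$/);
    $linemap[$newlineno] = $oldlineno;
    $newlineno++;
    print SRC2 $_;
}

$state = 0;            # do not carry a pending skip over into the second file
while (<DST>) {
    if ($state == 1) { $state = 0; next; }
    if (/\014/) { $state = 1; next; }
    next if (/\[Page/);
    next if (/^\s*$/);
    print DST2 $_;
}

close DST2;
close DST;
close SRC2;

# Diff the filtered files (old against new), ignoring whitespace
open(DIFFS, "diff -w $dst $src |") or die "cannot run diff: $!";
while (<DIFFS>) {
    # print ;
    next if (!/^\d/);              # only hunk headers start with a digit
    /c(\d+)(,(\d+))?/;             # "c" hunks only: line range in the new file
    # print $1 . " " . $3 . "\n";
    next if (!defined $1);
    my $start = $1;
    my $end = $3;
    if (!defined $end) { $end = $start; }
    for (my $idx = $start - 1; $idx <= $end - 1; $idx++) {
        # print $idx . "\n";
        $changes[$linemap[$idx]] = 1;
    }
}
close DIFFS;

# Re-read the original new file and prefix each line with "|" or " "
seek SRC, 0, 0;
$oldlineno = 0;
while (<SRC>) {
    if ($changes[$oldlineno]) { print "|"; } else { print " "; }
    print;
    $oldlineno++;
}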
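
The line-mapping idea in the script above can also be sketched in Python. This is a minimal sketch under stated assumptions, not a port: it swaps the external "diff -w" call for the standard-library difflib.SequenceMatcher, approximates -w by removing all whitespace before comparing, and the names filter_lines and change_bars are made up for illustration. Like the script's handling of "c" hunks, it marks only replaced lines and leaves pure insertions unmarked.

```python
import difflib


def filter_lines(lines):
    """Drop page-break residue and blank lines.

    Returns (kept, linemap), where linemap[i] is the index in `lines`
    of the i-th kept line.
    """
    kept, linemap = [], []
    skip_next = False
    for lineno, line in enumerate(lines):
        if skip_next:              # running header right after a form feed
            skip_next = False
            continue
        if "\f" in line:           # form feed marks a page break
            skip_next = True
            continue
        if "[Page" in line or not line.strip():
            continue               # page footer or blank line
        kept.append(line)
        linemap.append(lineno)
    return kept, linemap


def change_bars(new_lines, old_lines):
    """Prefix each line of new_lines with '|' if it was changed, else ' '."""
    new_kept, linemap = filter_lines(new_lines)
    old_kept, _ = filter_lines(old_lines)

    def squash(text):
        # Assumption: removing all whitespace stands in for diff's -w flag.
        return "".join(text.split())

    matcher = difflib.SequenceMatcher(
        None,
        [squash(s) for s in old_kept],
        [squash(s) for s in new_kept],
        autojunk=False,
    )
    changed = set()
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":       # corresponds to a "c" hunk in diff output
            changed.update(linemap[j] for j in range(j1, j2))
    return [("|" if n in changed else " ") + line
            for n, line in enumerate(new_lines)]
```

Passing the lines of the new draft and then the lines of the old draft, in the same argument order as the perl invocation, and writing the returned list to standard output would give a comparable result. Hunk boundaries may differ from those chosen by diff.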