iSCSI: comments/changes to draft-ietf-ips-iscsi-00.txt

Hi Julian,

Good job on the enormous effort bringing the iSCSI draft up to date. I have a few general comments and some specific ones. I have noted proposed changes to the draft in "diff -u" format with comments preceded by '!'. (I.e., additions are marked with '+', removals with '-', comments/questions with "+! ...".)

Brief general (major architectural) comments:

Mandating the use of the TCP urgent pointer is, in my opinion, very bad. It breaks TCP independence and is also unreliable in practice. Recommending its use is good though.

There are a lot of Reference Numbers in this new draft. I can immediately see the utility of *CmdRN, but remain unconvinced about the necessity of the others.

I would like to see the initiator be able to send the data PDU(s) on a different TCP stream(s) than the command PDU went on. This will allow emulation of the asymmetric model. This feature should be enabled by a Text command just in case any strange targets cannot cope with this. (I realize the intent is to keep the Initiator Task Tags local to a network adapter, but I think this should be an implementation decision, not a protocol mandate.)

Thanks for the good work.

Daniel F. Smith

Diffs follow.

--
IBM Almaden Research Center, 650 Harry Road, San Jose, CA 95120-6099, USA
K65B/C2 Phone: +1(408)927-2072 Fax: +1(408)927-3010

--- draft-ietf-ips-iscsi-00.txt.orig Tue Nov 7 12:12:08 2000
+++ draft-ietf-ips-iscsi-00.txt Tue Nov 7 18:32:31 2000
@@ -148,11 +148,11 @@ Satran Standards-Track, May 2001 3 iSCSI November, 2000 - +! Any table of contents? 1. Overview 1.1 SCSI Concepts @@ -162,20 +162,22 @@ the SCSI architecture. At the highest level, SCSI is a family of interfaces for requesting services from I/O devices, including hard drives, tape drives, CD and DVD drives, printers, and scanners. 
In SCSI parlance, an individual - I/O device is called a ôlogical unitö. + I/O device is called a "logical unit". +! I don't know what quote characters you were using, but they were weird +! blobs on my terminal. Replaced throughout with double-quotes. SCSI is a client-server architecture. Clients of a SCSI interface are - called ôinitiatorsö. Initiators issue SCSI ôcommandsö to request - service from a logical unit. The ôdevice serverö on the logical unit + called "initiators". Initiators issue SCSI "commands" to request + service from a logical unit. The "device server" on the logical unit accepts SCSI commands and executes them. - A ôSCSI transportö maps the client-server SCSI protocol to a specific + A "SCSI transport" maps the client-server SCSI protocol to a specific interconnect. Initiators are one endpoint of a SCSI transport. The - ôtargetö is the other endpoint. A ôtargetö can have multiple LUs + "target" is the other endpoint. A "target" can have multiple LUs behind it. Each logical unit has a number called a LUN. A SCSI task is a SCSI command or possibly a linked set of SCSI commands. Some LUNs support multiple pending (queued) tasks. The queue of tasks is managed by the target, though. The target uses an @@ -206,11 +208,11 @@ In keeping with similar protocols, the initiator and target divide their communications into messages. This document will use the term - ôiSCSI protocol data unitö (iSCSI PDU) for these messages. + "iSCSI protocol data unit" (iSCSI PDU) for these messages. 1.2.1 Layers & Sessions The following conceptual layering model is used in this document to specify initiator and target actions and how those relate to @@ -230,17 +232,18 @@ target form a session (loosely equivalent to a SCSI I-T nexus). A session is defined by a session ID (composed of an initiator part and a target part). TCP connections can be added and removed from a session. Connections within a session are identified by a connection ID (CID). 
Across all connections within a session an initiator will - see one "target image" - all target identifying elements like LUN are + see one "target image" - all target-identifying elements like LUN are the same. Also across all connections within a session a target will see one "initiator image" - all initiator identifying elements like - Initiator Task Tag are the same. + Initiator Task Tag are taken from the same pool. +! Hurrah! This is good. An iSCSI target MUST support at least one TCP connection. An iSCSI - initiator SHOULD support several connections in a session. + initiator MAY support several connections in a session. 1.2.2 Ordering and iSCSI numbering iSCSI uses Command, Status and Data numbering schemes. @@ -276,35 +279,46 @@ iSCSI supports ordered command delivery within a session. All commands (initiator-to-target) and responses (target-to-initiator) are numbered. Any SCSI activity is related to a task (SAM-2). The task is identified by the Initiator Task Tag for the life of the - task. Commands in transit from the initiator SCSI layer to the + task. + + Commands in transit from the initiator SCSI layer to the target SCSI layer are numbered by iSCSI and the number is carried by the iSCSI PDU as CmdRN (Command-Reference-Number). The numbering is session-wide. All iSCSI PDUs that have a task association carry this number. CmdRNs are allocated by the initiator iSCSI within a 32 bit unsigned counter (modulo 2**32). The value 0 is reserved and used to mean immediate delivery. Comparisons and arithmetic on CmdRN SHOULD use Serial Number Arithmetic as defined in - [RFC1982] where SERIAL_BITS = 32. + [RFC1982] where SERIAL_BITS = 32. + The target may choose to deliver some task management commands for immediate delivery. The means by which the SCSI layer may request immediate delivery for a command or by which iSCSI will decide by itself to mark a PDU for immediate delivery are outside the scope of this document. 
+ CmdRNs are significant only during command delivery to the target. Once the device serving part of the target SCSI has received a command, CmdRN ceases to be significant. During command delivery to - the target, the allocated numbers are unique session wide. The - initiator and target are assumed to have three registers that define - the allocation mechanism - CmdRN - the current command reference - number advanced by 1 on each command shipped; ExpCmdRN - the next - expected command by the target - acknowledges all commands up to it; - MaxCmdRN - the maximum number to be shipped - MaxCmdRN - ExpCmdRN - defines the queuing capacity of the receiving iSCSI layer. + the target, the allocated numbers are unique session wide. + + The initiator and target are assumed to have three registers that define + the allocation mechanism. + CmdRN: the current command reference number advanced by 1 on each + command shipped. + ExpCmdRN: the next expected command by the target, which acknowledges + all commands up to it. + MaxCmdRN: the maximum number to be shipped. The + queuing capacity of the receiving iSCSI layer is the difference + between MaxCmdRN and ExpCmdRN. + The target SHOULD NOT transmit a MaxCmdRN that is more than 2**31 - 1 +! Does this mean that T can reduce its MaxCmdRN? E.g., when there +! are more connections opened to it? Otherwise, why this restriction? above the last ExpCmdRN. CmdRN can take any value from ExpCmdRN to Satran Standards-Track, May 2001 6 @@ -315,11 +329,11 @@ outside this range or duplicates within the range not flagged with the retry bit (the X bit in the opcode). The target and initiator registers MUST uphold causal ordering. iSCSI initiators MUST implement the command/request numbering scheme - only if they support more than one connection per session (as even + if they support more than one connection per session (as even sessions with a single connection may be expanded beyond one connection). 
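
The CmdRN rules quoted in the hunks above (32-bit counter, comparisons via RFC 1982 with SERIAL_BITS = 32, and a window bounded by ExpCmdRN and MaxCmdRN) can be sketched as follows. This is only an illustration, not text from the draft: the function names are hypothetical, and the inclusive upper bound at MaxCmdRN is an assumption, since the quoted hunk cuts off before stating the end of the range.

```python
SERIAL_BITS = 32
MOD = 1 << SERIAL_BITS           # CmdRN is a counter modulo 2**32
HALF = 1 << (SERIAL_BITS - 1)    # RFC 1982 comparison horizon


def serial_lt(a: int, b: int) -> bool:
    """RFC 1982 'less than': True if a precedes b on the 32-bit circle."""
    a %= MOD
    b %= MOD
    return (a < b and b - a < HALF) or (a > b and a - b > HALF)


def serial_le(a: int, b: int) -> bool:
    """True if a equals b or precedes it in serial-number order."""
    return a % MOD == b % MOD or serial_lt(a, b)


def cmdrn_in_window(cmd_rn: int, exp_cmd_rn: int, max_cmd_rn: int) -> bool:
    """Assumed acceptance test: ExpCmdRN <= CmdRN <= MaxCmdRN (inclusive).

    CmdRN 0 is reserved for immediate delivery in the draft, so it is
    treated here as outside the ordered window.
    """
    if cmd_rn == 0:
        return False
    return serial_le(exp_cmd_rn, cmd_rn) and serial_le(cmd_rn, max_cmd_rn)
```

The wrap-around case is the reason for serial arithmetic: with ExpCmdRN near 2**32 and MaxCmdRN just past zero, a plain integer comparison would reject valid commands. It also shows why a MaxCmdRN more than 2**31 - 1 above ExpCmdRN would make the comparison ambiguous.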
Command numbering for sessions that will only be made up of one connection is optional. iSCSI initiators utilizing a single @@ -337,10 +351,11 @@ Responses in transit from the target to the initiator are numbered. The StatRN (Status Reference Number) is used for this purpose. StatRN is a counter maintained per connection. ExpStatRN is used by the initiator to acknowledge status. + To enable command recovery the target MAY maintain enough state to enable data and status recovery after a connection failure. A target can discard all the state information maintained for recovery after the status delivery is acknowledged through ExpStatRN. A large difference between StatRN and ExpStatRN may indicate a failed @@ -349,12 +364,13 @@ Initiators and Targets MUST support the response-numbering scheme regardless of the support for command recovery. 1.2.2.3 Data PDU numbering - Incoming Data PDUs MAY be numbered by a target to enable fast + Data PDUs MAY be numbered by a target to enable fast recovery of long running READ commands. + Data PDUs are numbered with DataRN. NOP command PDUs carrying the same Initiator Tag as the Data PDUs are used to acknowledge the incoming Data PDUs with ExpDataRN. Support for Data PDU acknowledgement and the maximum number of unacknowledged data PDUs are negotiated at login. @@ -365,14 +381,16 @@ In a PDU carrying both data and status, the field is used for StatRN and the last set of data blocks is implicitly acknowledged when Status is acknowledged. +! Gag! How many *RNs are there?!? I dislike them except for CmdRN. 1.2.3 Timers and timeouts - Initiators MUST implement the following timers: + Initiators SHOULD implement the following timers at some level: +! Normally in the system's SCSI layer. May be infinite (hence SHOULD). 
- T1 - Command delivery timer - T2 - Status delivery timer - T3 - Data delivery timer @@ -397,18 +415,20 @@ 1.2.4 iSCSI Login The purpose of iSCSI login is to enable a TCP connection for iSCSI - use, authenticate the parties, negotiate the sessionÆs parameters, + use, authenticate the parties, negotiate the session's parameters, +! Strange non-ASCII character for apostrophe. Quote inserted. open a security association protocol and mark the connection as belonging to an iSCSI session. A session is used to identify to a target all the connections with a given initiator. The targets listen on a well-known TCP port for incoming connections. The initiator begins the login process by connecting to that well-known TCP port. +! This well-known TCP port to be defined by... As part of the login process, the initiator and target MAY wish to authenticate each other and set a security association protocol for the session. This can occur in many different ways and is subject to @@ -418,36 +438,43 @@ negotiation. Negotiation and security associations executed before the Login Command are outside the scope of this document although they might realize a related function (e.g., establish a IPsec or TLS - session). The Login Command starts the iSCSI Login Phase. Within the + session). +! I have no idea what the prior sentence means. Maybe "This document will +! not cover the specific negotiation parameters of the chosen security +! scheme. However, a security wrapper, e.g., SSL/TLS or IPsec, may +! decide to use no security features associated with the iSCSI transport." + + The Login Command starts the iSCSI Login Phase. Within the Login Phase, negotiation is carried on through parameters of the Login Command and Response and optionally through intervening Text Commands and Responses. The Login Response concludes the Login Phase. Once suitable authentication has occurred, the target MAY authorize the initiator to send SCSI commands. 
How the target chooses to authorize an initiator is beyond the scope of this document. The target indicates a successful authentication and authorization by sending a login response with "accept login". Otherwise, it sends a - response with a ôlogin rejectö, indicating a session is not + response with a "login reject", indicating a session is not established. + It is expected that iSCSI parameters will be negotiated after the security association protocol is established if there is a security association. - The login message includes a session ID - composed with an initiator part ISID and a target part TSID. For a new session, the TSID is null. As part of the response, the target will generate a TSID. Session specific parameters can be specified only for the first login - of a session (TSID null)(e.g., the maximum number of connections that + of a session (TSID null) (e.g., the maximum number of connections that can be used for this session). Connection specific parameters (if any) can be specified for any login. Thus, a session is operational once it has at least one connection. - Any message except login and text sent on a TCP connection before + Any message except login response and text response sent on a + TCP connection before this connection gets into full feature phase at the initiator SHOULD be ignored by the initiator. Any message except login and text reaching a target on a TCP connection before the full feature phase MUST be silently ignored by the target. @@ -475,27 +502,36 @@ and it MAY be selected by omission (i.e. <key>:none MAY be omitted). The general format is: Offer-> <key>:(<value1>,<value2>,...,<valuen>) +! Does the comma ',' now become a special character? I was hoping to limit +! special escaped characters to ':' (key special) and nul '\0' (key and +! value special). Answer-> <key>:<valuex> 1.2.6 iSCSI Full Feature Phase Once the initiator is authorized to do so, the iSCSI session is in - iSCSI full feature phase. 
The initiator may send SCSI commands and + iSCSI full feature phase. The initiator may now send SCSI commands and data to the various LUNs on the target by wrapping them in iSCSI - messages that go over the established iSCSI session. For SCSI + messages that go over the established iSCSI session. + + For SCSI commands that require data and/or parameter transfer, the (optional) data and the status for a command must be sent over the same TCP connection that was used to deliver the SCSI command (we call this "connection allegiance"). Thus if an initiator issues a READ command, the target must send the requested data, if any, followed by the status to the initiator over the same TCP connection that was used to deliver the SCSI command. If an initiator issues a WRITE command, the initiator must send the data, if any, for that command +! Why must data be sent over the same TCP connection? I'd like to send +! it over a different one to get the asymmetric functionality. I.e., +! send a bunch of commands over stream A, with unsolicited data sucked +! out by the target on stream B. and the target MUST return the status over the same TCP connection that was used to deliver the SCSI command. However consecutive commands that are part of a SCSI linked commands task MAY use different connections - connection allegiance is strictly per-command and not per-task. During iSCSI Full Feature Phase, the initiator and @@ -504,17 +540,23 @@ - user data or command parameters) is sent as either unsolicited data or solicited data. Unsolicited data can be part of an iSCSI command PDU ("immediate data") or an iSCSI data PDU. An initiator may send only one unsolicited data item (immediate or in a separate PDU) - all subsequent data items have to be solicited. Solicited data are sent +! Why? I think this should be relaxed. If I has the option of sending +! an arbitrarily long data PDU then there's no advantage to be gained +! by disallowing unsolicited shorter data PDUs, but much to lose. 
in response to Ready To Transfer (R2T) PDUs. Targets operate in either solicited (R2T) data mode or unsolicited (non R2T) data mode. An initiator MUST always honor an R2T data request. It is considered an error for an initiator to send unsolicited data PDUs to a target operating in R2T mode (only solicited data). It is also an error for an initiator to send more than one unsolicited data PDU (whether - immediate or as a separate PDU). An initiator MAY request, at login, + immediate or as a separate PDU). + + An initiator MAY request, at login, +! (Again, disagree.) to send immediate data blocks of any size. If the initiator requests a specific block size the target MUST indicate the size of immediate data blocks it is ready to accept in its response. Beside iSCSI, SCSI also imposes a limit on the amount of unsolicited data a target @@ -523,35 +565,51 @@ iSCSI November, 2000 is willing to accept. The iSCSI immediate data limit MUST not exceed the SCSI limit. - +! (Yet again, disagree. TCP has perfectly good flow control, while SCSI may +! not. Consider the really-really long network connection and the tape +! drive with an 8KB data limit. TCP will do much better than SCSI here.) + A target SHOULD NOT silently discard data and request retransmission through R2T. Initiators MUST NOT perform any score boarding for data +! What is score boarding? and the residual count calculation is to be performed by the targets. - Incoming data is always solicited. SCSI Data packets are matched to + Data returned to the initiator is always solicited. SCSI Data packets + are matched to their corresponding SCSI commands by using Tags that are specified in the protocol. - Initiator tags for pending commands are unique initiator-wide for a - session. Target tags for pending commands are unique LU-wide for the - session; together with the LUN they form a target-wide unique + Initiator tags for pending commands are unique session-wide for an + initiator. 
Target tags for pending commands are unique session-wide + for the LU. Together with the LUN they form a target-wide unique +! Did you mean "for the LU" or "for the target" or (even) "for the Initiator +! Task Tag"? As worded, it means we have to match +! the Target Tag with the LUN. Sigh---that's a tricky 12-byte table index. +! It can be mitigated to 8 bytes with an Initiator Tag/Target Tag index. +! I'd prefer a 4-byte index by making it "for the target". composite tag for a session. The above mechanisms are designed to accomplish efficient data delivery and a large degree of control over - the data flow. iSCSI initiators and targets MUST also enforce some + the data flow. + + iSCSI initiators and targets MUST also enforce some ordering rules to achieve deadlock-free operation. Unsolicited data MUST be sent on every connection in the same order in which commands were sent. If the amount of data exceeds the amount allowed for unsolicited write data, the specific connection MUST be stalled - no new data will be sent on the specific connection until initiator receives an R2T iSCSI PDU from the target. A target +! (Again, disagree.) receiving data out of order or observing a connection violating the above rules MUST terminate the session. +! (Agree here, though I would use SHOULD. Otherwise nasty people might +! start inserting random data PDUs into iSCSI streams just to disrupt +! sessions and force logins.) Each iSCSI session to a target is treated as if it originated from a - different initiator. + different and logically independent initiator. 1.2.7 iSCSI Connection Termination Connection termination is assumed an exceptional event. Graceful TCP connection shutdowns are done by sending TCP FINs. @@ -559,10 +617,15 @@ outstanding tasks that have allegiance to the connection. A target SHOULD respond rapidly to a FIN from the initiator by closing it's half of the connection as soon as it has finished all outstanding tasks that have allegiance to the connection. 
Connection termination with outstanding tasks may require recovery actions. +! I would still like a close command (with a CmdRN). Just for completeness. +! (I know it isn't strictly necessary, but it does allow a target to close +! the TCP stream for the initiator, which may have useful FIN_WAITx +! properties in TCP and other transport protocols. In HTTP, the server +! closes the connection, not the client.) Connection termination is also required as prelude to recovery. By terminating a connection before starting recovery, initiator and target can avoid having stale PDUs being received after recovery. In this case, the initiator will send a LOGOUT request on any of the @@ -579,15 +642,15 @@ 1.2.8 Naming & mapping Text string names are used in iSCSI to: - - provide explicitly a transportID for the target to enable the - later to recognize the initiator because the conventional IP- - address and port tuple is inaccurate behind firewalls and NAT + - provide explicitly a initiatorID for the target to enable the + target to recognize the initiator because the conventional IP- + address and port pair is inaccurate behind firewalls and NAT devices (key - initiator) - - provide a target selector - targetID for simple + - provide a targetID for simple configurations hiding several targets behind an IP-address and port (key - target) - provide a symbolic address for source and destination targets in third party commands (through the map command) @@ -598,39 +661,44 @@ relate them to other names and name handling mechanisms the following syntax for names SHOULD be used <domain-name>[/modifier] - Where domain-name follows DNS rules and the modifier is an + Where domain-name follows DNS (or dotted IP) rules and the modifier is an alphanumeric string (N.B. the whole pattern follows the URL structure) Some mapped names for third party command use might have to include a port number. 
For those the following syntax SHOULD be used: - <domain-name>[[/modifier]:[port]] + <domain-name>[:<port>][/modifier] The text to address transformation, wherever needed, will be performed through available name translation services (DNS servers, - LDAP accessible directories etc.) + LDAP accessible directories etc.). To enable simple devices to operate without name-to-address conversion services the following conventions SHOULD be used: A domain name that contains exactly four numbers separated by dots (.), where each number is in the range 0 through 255, will be interpreted as an IPv4 address. A domain name that contains more than four, but at most 16 numbers separated by dots (.), where each number is in the range 0 through 255, will be interpreted as an Ipv6 address. +! These are covered in RFCs. I forget the numbers for citation... B-) Satran Standards-Track, May 2001 12 iSCSI November, 2000 - + Examples of DNS addresses/names: + + tapedrive37.acme.com/cart3 + tapedrive38.acme.com + Examples of IPv4 addresses/names: 10.0.0.1/diskfarm1 10.0.0.2 @@ -643,38 +711,40 @@ For management/support tools as well as naming services that use a text prefix to express the protocol intended (as in http:// or ftp://) the following form MAY be used: - iSCSI://<domain-name>[[/modifier]:[port]] + iSCSI://<domain-name>[:<port>][/modifier] +! Is "iSCSI" case-sensitive? Examples: iSCSI://diskfarm1.acme.com iSCSI://computingcenter.acme.com/diskfarm1 - - - + iSCSI://printroom.acme.com:4002/scanners/drum12 + When a target has to act as an initiator for a third party command, it MAY use the initiator name it learned during login as required by the authentication mechanism to the third party. To address targets and logical units within a target, SCSI uses a fixed length (8 bytes) uniform addressing scheme; in this document, we call those addresses SCSI reference addresses (SRA). +! This is a LUN? Right? 
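
The name syntax proposed in the hunks above, `<domain-name>[:<port>][/modifier]` with an optional `iSCSI://` prefix, together with the dotted-IPv4 convention, might be parsed along these lines. This is a rough sketch under stated assumptions: the function names are hypothetical, the prefix is matched case-sensitively (the post leaves that question open), and the modifier is allowed to contain slashes because the `scanners/drum12` example does.

```python
import re

# Optional "iSCSI://" prefix, then domain, optional ":port", optional "/modifier".
_NAME_RE = re.compile(
    r"^(?:iSCSI://)?"
    r"(?P<domain>[^:/]+)"
    r"(?::(?P<port>[0-9]+))?"
    r"(?:/(?P<modifier>.+))?$"
)


def parse_iscsi_name(text: str):
    """Split a name into (domain, port or None, modifier or None)."""
    match = _NAME_RE.match(text)
    if match is None:
        raise ValueError(f"not a recognisable iSCSI name: {text!r}")
    port = match.group("port")
    return (
        match.group("domain"),
        int(port) if port is not None else None,
        match.group("modifier"),
    )


def looks_like_ipv4(domain: str) -> bool:
    """Exactly four dot-separated numbers, each in the range 0 through 255."""
    parts = domain.split(".")
    return len(parts) == 4 and all(
        p.isascii() and p.isdigit() and int(p) <= 255 for p in parts
    )
```

Putting the port before the modifier, as the proposed change does, keeps the pattern aligned with ordinary URL structure, so one expression covers both the bare and the prefixed forms.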
To provide the target with the protocol specific addresses (iSCSI or FC) iSCSI uses a Map Command; the Map command sends the managing target the protocol specific addresses and gets from the target the SRAs to use in subsequent commands. For iSCSI, a protocol specific address is a TCP address and a modifier. After mapping, iSCSI will be provided with a handle to the address in standard SCSI format. + The Map command is useful for third party commands. 1.2.9 Message Framing - iSCSI presents an mapping of the SCSI protocol onto TCP. This + iSCSI presents a mapping of the SCSI protocol onto TCP. This encapsulation is accomplished by sending iSCSI PDUs that are of varying length. Unfortunately, TCP does not have a built-in mechanism Satran Standards-Track, May 2001 13 @@ -711,13 +781,17 @@ length) makes it impossible to find message boundaries in subsequent TCP segments. The missing TCP segment must be received before any following segments can be processed. The iSCSI protocol uses the urgent bit in the TCP header to delineate - iSCSI messages. The first byte of every iSCSI message MUST be marked + iSCSI messages. The first byte of every iSCSI PDU sent SHOULD be marked "urgent". The result is the TCP urgent pointer will point to the first byte of the iSCSI message in the TCP segment. +! Not strictly true. There can only be one outstanding urgent pointer, +! and it always points to the most recently queued urgent byte. Hence, +! analysers will get a pointer to _a_ start-of-PDU, but not the _next_ +! start-of-PDU. When a large iSCSI message is sent, the first TCP segment will contain the iSCSI header, but the remaining TCP segments will not contain any iSCSI framing information. To minimize the amount of buffering required when an iSCSI header is lost, it is recommended @@ -737,23 +811,23 @@ interpretation is being used on the data received. Bit 7 in the first byte of the iSCSI message (F bit in the opcode field) that shall always be zero. 
Bit 7 in the following byte (opcode specific fields) shall always be one. When an iSCSI implementation receives an out of +! Ugh! B-P order TCP segment with the Urgent pointer defined, it shall look at the byte pointed to by the Urgent pointer. If the bit is clear, the sender is RFC1122 compliant. If the bit is set, the sender has implemented the BSD interpretation, and must "back up" one byte to find the beginning of the iSCSI message. - - - - - - - - +! I am strongly against mandating this policy. +! It is a clever and useful idea, but impractical. There are sandboxed +! TCP/socket implementations that will not allow the urgent bit to be set +! and it's also possible for the TCP stream to be reframed and urgent lost. +! Lose the "MUST" and I'll be happy! (After all, if the target receives +! a non-compliant data PDU (i.e., no urgent flag) it is supposed to drop the +! stream, whereas this section is purely informational and for convenience.) @@ -798,11 +872,11 @@ are to be represented in network byte order (i.e., big endian). Any bits not defined should be set to zero. 2.1 Template Header and Opcodes - All iSCSI PDUs begin with a 48-byte header. Additional data appears, + All iSCSI PDUs begin with a 48-byte header. Additional data appears, as necessary, beginning with byte 48. The fields of Opcode and Length appear in all iSCSI PDUs. In addition, the Initiator Task tag, Logical Unit Number, and Flags fields, when used, always appear in the same location in the header. 
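
For reference, the urgent-pointer disambiguation rule quoted from the draft in the earlier hunk (bit 7 of the first PDU byte is always zero, bit 7 of the second byte is always one) amounts to roughly the following. This is a minimal sketch of the draft's rule as quoted, not an endorsement of mandating it; the function name and the idea of passing the urgent position as an index into a received segment are assumptions made for illustration.

```python
def pdu_start_from_urgent(segment: bytes, urgent_index: int) -> int:
    """Return the index of the first byte of the iSCSI PDU in `segment`.

    `urgent_index` is where the receiver's TCP stack says the urgent
    pointer lands within this segment (a hypothetical input).
    """
    if not 0 <= urgent_index < len(segment):
        raise ValueError("urgent pointer lies outside this segment")
    if segment[urgent_index] & 0x80:
        # Bit 7 set: this is the second PDU byte, so the sender used the
        # BSD interpretation and the PDU starts one byte earlier.
        if urgent_index == 0:
            raise ValueError("PDU start is in the previous segment")
        return urgent_index - 1
    # Bit 7 clear: RFC 1122 interpretation, already at the first PDU byte.
    return urgent_index
```

As the comments in the diff point out, this only helps when the urgent flag actually survives the path and the sending stack allows it to be set, which is the argument for SHOULD rather than MUST.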
@@ -810,12 +884,12 @@ Byte / 0 | 1 | 2 | 3 | / | | | | |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| +---------------+---------------+---------------+---------------+ - 0|F| Opcode |1|X| Opcode-specific fields | - | | |P| | + 0|0| Opcode |1|X| Opcode-specific fields | + | | | |P| | +---------------+---------------+---------------+---------------+ 4| Length of Data (after 48 byte Header) | +---------------+---------------+---------------+---------------+ 8| LUN or Opcode-specific fields | + + @@ -824,23 +898,20 @@ 16| Initiator Task Tag or Opcode-specific fields | +---------------+---------------+---------------+---------------+ 20/ Opcode-specific fields / +/ / +---------------+---------------+---------------+---------------+ - 48| Header digest (optional-constant-length) | + 48| Header digest (optional, constant-length) | +---------------------------------------------------------------+ - +n/ / - +/ Data (optional) / + +n/ Data (optional) / + +/ / +---------------------------------------------------------------+ - m| Data digest (optional-variable-length) | + m/ Data digest (optional, variable-length) / + +/ / +---------------------------------------------------------------+ -2.1.1 F bit - - If set to 1 indicates BSD semantic for the urgent pointer. - If set to 0 indicates RFC 1122 semantic for the urgent pointer. - +! Section 2.1.1 deleted. Unnecessary. Satran Standards-Track, May 2001 16 iSCSI November, 2000 @@ -871,10 +942,13 @@ 0x02 SCSI Task Management Command 0x03 Login Command 0x04 Text Command 0x05 SCSI Data (for WRITE operation) 0x06 NOP Command (from initiator to target) +! Can we rename the four NOP commands 0x00, 0x06, 0x40 and 0x46? Or comment +! them better? I presume NOP-In precedes NOP-Out and (independently) +! NOP Command precedes NOP Response. Did 0x06 used to be ping? 
0x07 Map Command 0x08 Logout Command Valid target opcodes are: @@ -906,12 +980,12 @@ 2.1.3 Opcode-specific fields These fields have different meanings for different messages. Bit 7 of the second byte MUST be 1 and bit 6 of the second byte is - used as a retry indicator for commands (X bit) or Poll bit and must - be 0 in all other iSCSI PDUs + used as a retry indicator for commands (X bit) or Poll bit (P) and must + be 0 in all other iSCSI PDUs. 2.1.4 Length The Length field indicates the number of bytes, beyond the first 48 bytes, that are being sent together with this message header. The @@ -926,10 +1000,11 @@ Number (LUN) field identifies which Logical Unit.. If the opcode does not relate to a Logical Unit, this field either is ignored or may be used for some other purpose. The LUN field is 64-bits in accordance with [SAM2]. The exact format of this field can be found in the [SAM2] document. +! Is this field also the SRA holder? 2.1.6 Initiator Task Tag The initiator assigns a Task Tag to each SCSI task that it issues. This tag is a session-wide unique identifier that can be used to @@ -941,12 +1016,12 @@ be the least significant n-bits of the Initiator Task Tag. For example: tag:16 - means that only the last 16 bits of the Initiator Task Tag will be - used (the first 16 have to be 0). + means that only the low 16 bits of the Initiator Task Tag will be + used (the high 16 have to be 0). 
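
The fixed part of the template header discussed in sections 2.1 through 2.1.6 above (with the F bit replaced by a constant 0, as the diff proposes) could be decoded roughly as follows. This is a sketch under stated assumptions: the function and key names are hypothetical, the LUN is treated as an opaque 64-bit value, and the `tag_bits` argument stands in for a negotiated `tag:<n>` width.

```python
import struct

HEADER_LEN = 48  # every iSCSI PDU begins with a 48-byte header


def parse_template_header(pdu: bytes, tag_bits: int = 32) -> dict:
    """Decode the fields that sit at fixed offsets in every PDU header."""
    if len(pdu) < HEADER_LEN:
        raise ValueError("PDU shorter than the 48-byte header")
    byte0, byte1 = pdu[0], pdu[1]
    # All multi-byte fields are in network byte order (big endian).
    (length,) = struct.unpack_from(">I", pdu, 4)    # bytes after the header
    (lun,) = struct.unpack_from(">Q", pdu, 8)       # LUN or opcode-specific
    (task_tag,) = struct.unpack_from(">I", pdu, 16)  # Initiator Task Tag
    return {
        "opcode": byte0 & 0x7F,
        # Byte 0 bit 7 must be 0 and byte 1 bit 7 must be 1.
        "framing_ok": (byte0 & 0x80) == 0 and (byte1 & 0x80) != 0,
        # Byte 1 bit 6 is the retry (X) or poll (P) bit, depending on opcode.
        "x_or_p": bool(byte1 & 0x40),
        "length": length,
        "lun": lun,
        "task_tag": task_tag,
        # With tag:16 only the low 16 bits may be used; the high 16 must be 0.
        "tag_in_range": (task_tag >> tag_bits) == 0,
    }
```

For example, a SCSI Command header (opcode 0x01) with the retry bit set and an Initiator Task Tag of 0xBEEF passes the `tag:16` check, while a tag of 0x1BEEF does not.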
Satran Standards-Track, May 2001 18 iSCSI November, 2000 @@ -1012,11 +1087,11 @@ Byte / 0 | 1 | 2 | 3 | / | | | | |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| +---------------+---------------+---------------+---------------+ - 0|F| 0x01 |1|X|R|W|0|ATTR | Reserved (0) | AddCDB + 0|0| 0x01 |1|X|R|W|0|ATTR | Reserved (0) | AddCDB | +---------------+---------------+---------------+---------------+ 4| Length | +---------------+---------------+---------------+---------------+ 8| Logical Unit Number (LUN) | + + @@ -1040,11 +1115,10 @@ 2.2.1 Flags & Task Attributes The flags field for a SCSI Command is: - b7 1 MUST be 1 for framing b6 Retry (X) b5 (R) set to 1 when input data is expected b4 (W) set to 1 when output data is expected b3 Reserved (MUST be 0) @@ -1066,20 +1140,24 @@ 4 ACA 2.2.2 AddCDB Additional CDB length (over 16) in units of 4 bytes. +! I thought the CDB was limited to 256 bytes by SAM2 (hunch). This +! could just be changed to be equal to the actual CDB length. It would +! be nicer that way. 2.2.3 CmdRN - Command Reference Number Enables ordered delivery across multiple connections in a single session. 2.2.4 ExpStatRN - Expected Status Reference Number - Command responses up to ExpStatRN -1 (mod 2**32) have been received + Command responses up to ExpStatRN-1 (mod 2**32) have been received (acknowledges status) on the connection. +! I'm still not convinced that the target requires this information. 2.2.5 Expected Data Transfer Length For unidirectional operations, the Expected Data Transfer Length field states the number of bytes of data involved in this SCSI @@ -1092,10 +1170,13 @@ For bi-directional operations, this field states the number of data bytes involved in the outbound transfer. For bi-directional operations, an additional field indicating the Expected Bidi-Read Data Transfer Length is following the (possibly extended) CDB as shown bellow: +! I'd prefer this swapped.... Have the return length in the main PDU +! 
and the send length in the extended part. Just personal preference +! (from implementation hassles). +---------------+---------------+---------------+---------------+ 48/ Additional CDB (if any) / +/ / +---------------+---------------+---------------+---------------+ @@ -1135,18 +1216,20 @@ Some SCSI commands require additional parameter data to accompany the SCSI command. This data may be placed beyond the 48-byte boundary of the iSCSI header. Alternatively, user data (as from a WRITE operation) can be placed in the same PDU (both cases referred to as immediate data). + +! Form feed here? 2.3 SCSI Response Byte / 0 | 1 | 2 | 3 | / | | | | |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| +---------------+---------------+---------------+---------------+ - 0|F| 0x41 |1|Rsvd |o|u|O|U| Reserved (0) | + 0|0| 0x41 |1|Rsvd |o|u|O|U| Reserved (0) | +---------------+---------------+---------------+---------------+ 4| Length | +---------------+---------------+---------------+---------------+ 8| Reserved (0) | + + @@ -1166,11 +1249,11 @@ iSCSI November, 2000 32| MaxCmdRN | +---------------+---------------+---------------+---------------+ - 36| Command Status| Reserved (0) | + 36| Command Status| iSCSI status | Reserved (0) | +---------------+---------------+---------------+---------------+ 40| Resp_length | Sense_length | +---------------+---------------+---------------+---------------+ 44| Bidi-Read Residual Count | +---------------+---------------+---------------+---------------+ @@ -1181,21 +1264,22 @@ 2.3.1 Byte 1 - Flags b0 (U) set for Residual Underflow. In this case, the Basic Residual Count indicates how many bytes were not transferred out of those expected to be transferred. - b1 (O) set for Residual Overflow. In this case, the Bsic + b1 (O) set for Residual Overflow. In this case, the Basic Residual Count indicates how many bytes could not be transferred because the initiator's Expected Data Transfer Length was too small. 
- b2 (u) same as b0 but for the read-part of a bi-directional + b2 (u) same as b0 but for the write-part of a bi-directional operation b3 (o) same as b1 but for the read-part of a bi-directional operation - b4-6 not used (SHOULD be set to 0) + b4-6 not used (set to 0) - Bits O and U are mutually exclusive and so are bits o and u. + Only one of bits O and U may be set. Similarly with bits o and u. +! "Mutually exclusive" might imply "O xor U = 1"! 2.3.2 Basic Residual Count The Basic Residual Count field is valid only in case either the U bit or the O bit is set. If neither bit is set, the Basic Residual Count @@ -1219,24 +1303,36 @@ iSCSI November, 2000 transferred in because the initiator's Expected Bidi-Read Transfer Length was too small. +! It occurs to me that it would be much simpler to have two fields and two +! flags. That is, WriteDelta and ReadDelta. The sign of the deltas would +! be indicated by the flags O and I. Normally both fields would be zero. +! This should work for R, W and R/W commands. (We might even want to +! specify signed deltas to eliminate the flags altogether.) -2.3.4 Command Status +2.3.4 Command Status and iSCSI Status The Command Status field is used to report the SCSI status of the command (as specified in [SAM2]). + + The iSCSI Status field is used to report problems in transporting the + command to the LU. The following values are currently defined. + + 0 no error + 1 LU not found (could not start command phase to LU) 2.3.5 Resp_length - Response length 2.3.6 Sense_length - Length of sense data 2.3.7 Response and/or Sense Data iSCSI targets MUST support and enable autosense. If the Command Status was CHECK CONDITION (0x02), then the Response and/or Sense +! Hurrah! Take a stand! Data field will contain sense data for the failed command after the response data. Some sense codes will relate to iSCSI check conditions (e.g. excessive number of outstanding commands, immediate data blocks too large etc.). 
The Length parameters specify the number of bytes in each section of this field. If no error occurred, @@ -1337,11 +1433,11 @@ Byte / 0 | 1 | 2 | 3 | / | | | | |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| +---------------+---------------+---------------+---------------+ - 0|F| 0x00 |1|P| Reserved (0) | + 0|0| 0x00 |1|P| Reserved (0) | +---------------+---------------+---------------+---------------+ 4| Length | +---------------+---------------+---------------+---------------+ 8/ Reserved (0) / +/ / @@ -1353,11 +1449,11 @@ +---------------+---------------+---------------+---------------+ 48 2.4.1 P - poll bit - Request a NOP-In message + Request a NOP-In message from the target. @@ -1392,11 +1488,11 @@ Byte / 0 | 1 | 2 | 3 | / | | | | |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| +---------------+---------------+---------------+---------------+ - 0|F| 0x40 |1|P| Reserved (0) | + 0|0| 0x40 |1|P| Reserved (0) | +---------------+---------------+---------------+---------------+ 4| Length | +---------------+---------------+---------------+---------------+ 8/ Reserved (0) / +/ / @@ -1410,11 +1506,11 @@ +---------------+---------------+---------------+---------------+ 48 2.5.1 P - poll bit - Request a NOP-Out message + Request a NOP-Out message from the initiator. @@ -1444,11 +1540,11 @@ Byte / 0 | 1 | 2 | 3 | / | | | | |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| +---------------+---------------+---------------+---------------+ - 0|F| 0x02 |1|0| Function | Reserved (0) | + 0|0| 0x02 |1|0| Function | Reserved (0) | +---------------+---------------+---------------+---------------+ 4| Length | +---------------+---------------+---------------+---------------+ 8| Logical Unit Number (LUN) | + + @@ -1494,18 +1590,21 @@ For the functions above a SCSI Task Management Response MUST be returned, using the Initiator Task Tag to identify the operation for which it is responding. 
+ For the <Clear Task Set> the target MUST send an Asynchronous Event to all other attached initiators to inform them that all pending tasks are cancelled and then enter the ACA state for any initiator for which it had pending tasks. + For the <Target Warm Reset> and <Target Cold Reset> functions, the target cancels all pending operations. The target MUST send an Asynchronous Event to all attached initiators notifying them that the target is being reset. +! For warm reset, should the target/LU send an AE once reset is complete? In addition, for the <Target Warm Reset> the target will enter the ACA state on all sessions and all LUs on which an AE was sent. In addition, for the <Target Cold Reset> the target then MUST @@ -1555,11 +1654,11 @@ Byte / 0 | 1 | 2 | 3 | / | | | | |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| +---------------+---------------+---------------+---------------+ - 0|F| 0x42 |1|0| Reserved (0) | + 0|0| 0x42 |1|0| Reserved (0) | +---------------+---------------+---------------+---------------+ 4| Length | +---------------+---------------+---------------+---------------+ 8| Logical Unit Number (LUN) | + + @@ -1671,11 +1770,11 @@ Byte / 0 | 1 | 2 | 3 | / | | | | |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| +---------------+---------------+---------------+---------------+ - 0|F| 0x05 |1|0| Reserved (0) | + 0|0| 0x05 |1|0| Reserved (0) | +---------------+---------------+---------------+---------------+ 4| Length | +---------------+---------------+---------------+---------------+ 8| LUN or Reserved (0) | 12| | @@ -1721,11 +1820,11 @@ Byte / 0 | 1 | 2 | 3 | / | | | | |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| +---------------+---------------+---------------+---------------+ - 0|F| 0x45 |1|P| (0) |S|O|U| Reserved (0) | + 0|0| 0x45 |1|P| (0) |S|O|U| Reserved (0) | +---------------+---------------+---------------+---------------+ 4| Length | 
+---------------+---------------+---------------+---------------+ 8| Reserved (0) | +---------------+---------------+---------------+---------------+ @@ -1788,12 +1887,12 @@ command completed with an error, then the response and sense data must be sent in a SCSI Response packet and must not be sent in a SCSI Data packet. b0-1 as in an ordinary SCSI Response - b2 S (status)- set to indicate that the Command Status field - contains status + b2 S (status)- set to indicate that the Command Status and + iSCSI Status fields contain valid information b3-5 not used (should be set to 0) b6 P (poll) - set to indicate data acknowledgement is requested; b7 and b2 are mutually exclusive - if S bit is set P bit MUST be ignored @@ -1808,12 +1907,13 @@ specifying the next expected packet in a NOP command with the same Initiator Tag. Acknowledging NOP PDUs MAY be postponed for a maximum of 32 incoming data PDUs. An explicit request for acknowledgement made by setting the P bit MUST be honored. - +2.8.6 Command Status and iSCSI Status + See the description in SCSI Response (section 2.3). Satran Standards-Track, May 2001 34 @@ -1830,11 +1930,11 @@ Byte / 0 | 1 | 2 | 3 | / | | | | |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| +---------------+---------------+---------------+---------------+ - 0|F| 0x04 |1|0| Type | Reserved (0) | + 0|0| 0x04 |1|0| Type | Reserved (0) | +---------------+---------------+---------------+---------------+ 4| Length | +---------------+---------------+---------------+---------------+ 8/ Reserved (0) / +/ / @@ -1853,14 +1953,18 @@ 48/ Text / +/ / +---------------+---------------+---------------+---------------+ 2.9.1 Type + + This field contains a value designating the context of the Text command. + Currently defined values are: 0 outside login phase 1 within login - +! What is this field used for anyway? Doesn't everyone know what phase +! they're in? 2.9.2 Length This is the length, in bytes, of the Text field.
@@ -1880,33 +1984,37 @@ 2.9.4 Text The initiator sends the target a set of key:value or key:(list) pairs encoded in UTF-8 Unicode. The key and value are separated by a ':' (0x3A) delimiter. Many key:value pairs can be included in the Text - block by separating them with null ' ' (0x00) delimiters. Some basic - key:value pairs are described in Appendix A & C. The target responds + block by separating them with nul '\0' (0x00) delimiters. Some basic + key:value pairs are described in Appendices A & C. The target responds by sending its response back to the initiator. The target and initiator can then perform some advanced operations based on their common capabilities. Manufacturers may introduce new keys by prefixing them with their - (reversed) domain name, for example, + (reversed) domain name, for example the company owning the domain + acme.com can issue - com.foo.bar.do_something:0000000000000003 + com.acme.bar.foo.do_something:0000000000000003 Any key that the target does not understand may be ignored without affecting basic function. Once the target has processed all the key:value or key:(list) pairs, it responds with the Text Response command, listing the parameters that it supports. It is recommended that Text operations that will take a long time should be placed in their own Text command. If the Text Response does not contain a key that was requested, the initiator must assume that the key was not understood by the target. + Targets and initiators may limit the size of the text accepted in a text command and text response as well as the size of key:value pairs. Such limits should be indicated at login. + The default limit is 16384 UTF8 characters. +! Is this a hard limit, or made to match the common window size in TCP? 
@@ -1937,11 +2045,11 @@ Byte / 0 | 1 | 2 | 3 | / | | | | |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| +---------------+---------------+---------------+---------------+ - 0|F| 0x44 |1|0| Type | Reserved (0) | + 0|0| 0x44 |1|0| Type | Reserved (0) | +---------------+---------------+---------------+---------------+ 4| Length | +---------------+---------------+---------------+---------------+ 8/ Reserved (0) / +/ / @@ -1963,12 +2071,11 @@ +/ / +---------------+---------------+---------------+---------------+ 2.10.1 Type - 0 outside login phase - 1 within login + This field is mirrored from the original Text command. 2.10.2 Length This is the length, in bytes, of the Text Response field. @@ -1989,11 +2096,11 @@ format as the Text Command. Appendix C lists some basic Text Commands and their Responses. If the Text Response does not contain a key that was requested, the initiator must assume that the key was not understood by the target or that the answer is <key>:none and the two MUST be equivalent where applicable. - +! I don't like this "none" thing. It's ugly. Blank (nul) would be better. @@ -2047,11 +2154,11 @@ Byte / 0 | 1 | 2 | 3 | / | | | | |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| +---------------+---------------+---------------+---------------+ - 0|F| 0x03 |1|0| Rsrvd (0) | Version-major | Version-minor | + 0|0| 0x03 |1|0| Rsrvd (0) | Version-major | Version-minor | +---------------+---------------+---------------+---------------+ 4| Length | +---------------+---------------+---------------+---------------+ 8| CID | Reserved (0) | +---------------+---------------+---------------+---------------+ @@ -2070,19 +2177,21 @@ +/ / +---------------+---------------+---------------+---------------+ 2.11.1 Version-major - Currently 1. + Currently 0. 2.11.2 Version-minor - Currently 0. + Currently 3. +! It isn't a standard yet! 
2.11.3 CID - A unique id for this connection within the session + A unique id for this connection within the session. +! Who decides this? What is it? Tell me something about it! 2.11.4 Initiator Task Tag Satran Standards-Track, May 2001 39 @@ -2090,10 +2199,11 @@ iSCSI November, 2000 This tag identifies all the commands and responses within the login sequence. +! Arghh! That's no help. 2.11.5 InitCmdRN Is significant only if TSID is zero and indicates the starting Command reference number for this session; it SHOULD be zero for all @@ -2154,11 +2264,11 @@ Byte / 0 | 1 | 2 | 3 | / | | | | |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| +---------------+---------------+---------------+---------------+ - 0|F| 0x43 |1|0| Rsrvd (0) | Version-major | Version-minor | + 0|0| 0x43 |1|0| Rsrvd (0) | Version-major | Version-minor | +---------------+---------------+---------------+---------------+ 4| Length | +---------------+---------------+---------------+---------------+ 8/ Reserved (0) / +/ / @@ -2205,10 +2315,12 @@ The Status returned in a Login Response is one of the following: 0 accept login (will now accept SCSI commands) 1 reject login +! What happened to "more authentication required" to allow the challenge/ +! response sequence to go through Text commands? In the case that the Status is "accept login" the initiator may proceed to issue SCSI commands. In the case that the Status is "reject login" the initiator should immediately close down its end of the TCP connection, thus freeing up the target's port for some other @@ -2261,11 +2373,12 @@ Byte / 0 | 1 | 2 | 3 | / | | | | |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| +---------------+---------------+---------------+---------------+ - 0|1| 0x06 |1|P| Reserved (0) | + 0|0| 0x06 |1|P| Reserved (0) | +! There was a one above---assumed to be a mistake. Otherwise, explain.
+---------------+---------------+---------------+---------------+ 4| Length | +---------------+---------------+---------------+---------------+ 8/ Reserved (0) / +/ / @@ -2312,11 +2425,11 @@ close the connection and may try to establish a new connection. The NOP command with the P bit not set MAY be used to acknowledge data received from a target (data-ack). In this case, the command caries the same Initiator Task Tag as the data it acknowledges and - the CmdRN field MUST be zero. The field ExpStatRN/ ExpDataRN is then + the CmdRN field MUST be zero. The field ExpStatRN/ExpDataRN is then understood to be ExpDataRN. Repeated or obsolete data acknowledgements MUST be silently discarded by the target. @@ -2370,11 +2483,11 @@ Byte / 0 | 1 | 2 | 3 | / | | | | |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| +---------------+---------------+---------------+---------------+ - 0|F| 0x46 |1|0| Reserved (0) | + 0|0| 0x46 |1|0| Reserved (0) | +---------------+---------------+---------------+---------------+ 4| Length | +---------------+---------------+---------------+---------------+ 8/ Reserved (0) / +/ / @@ -2424,11 +2537,11 @@ Byte / 0 | 1 | 2 | 3 | / | | | | |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| +---------------+---------------+---------------+---------------+ - 0|F| 0x07 |1|0| Function | Reserved (0) | + 0|0| 0x07 |1|0| Function | Reserved (0) | +---------------+---------------+---------------+---------------+ 4| Length | +---------------+---------------+---------------+---------------+ 8| Reserved (0) | + + @@ -2448,18 +2561,16 @@ 48| Descriptor Type | Descriptor Length | +---------------+---------------+---------------+---------------+ 52/ Descriptor / +/ / +---------------+---------------+---------------+---------------+ - --------------------------------------------------------------------- - +---------------+---------------+---------------+---------------+ - | Descriptor Type | Descriptor Length | + +| Descriptor Type | 
Descriptor Length | +---------------+---------------+---------------+---------------+ - / Descriptor / + +/ Descriptor / +/ / +---------------+---------------+---------------+---------------+ - + + ... or @@ -2475,16 +2586,14 @@ +---------------+---------------+---------------+---------------+ 48| 8 byte Descriptor | +| | +---------------+---------------+---------------+---------------+ - --------------------------------------------------------------------- - +---------------+---------------+---------------+---------------+ - N | 8 byte Descriptor | + +| 8 byte Descriptor | +| | +---------------+---------------+---------------+---------------+ - + + ... The mapping command enables the initiator to map iSCSI specific addresses and access control information into formats compliant with the SCSI command standards (e.g., [SPC-2]). @@ -2496,11 +2605,11 @@ provide the 8 byte SCSI compliant address reference 0 Unmap - given a SCSI compliant address reference remove the mapping associated with it. Address/access control descriptors follow the header. For the map - function the following descriptor types are defined: + function the following Descriptor Types are defined: 0 Binary IP Version 4 TCP address (IP+Port) followed by a selector string; length should be 6+the selector length+1 1 Binary IP Version 6 TCP address (IP+Port) followed by a selector string; length should be 18+the selector length+1 @@ -2508,11 +2617,11 @@ selector followed by null) 3 FC address & port - in case access control is based on transport ID 4 access proxy token - Details for 3 & 4 have to be coordinated with T10 + Details for 3 & 4 have to be coordinated with T10. 
For the unmap function the descriptors are standard 8 byte SRAs (SCSI Reference Address) @@ -2530,11 +2639,11 @@ Byte / 0 | 1 | 2 | 3 | / | | | | |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| +---------------+---------------+---------------+---------------+ - 0|F| 0x47 |1|0| Reserved (0) | + 0|0| 0x47 |1|0| Reserved (0) | +---------------+---------------+---------------+---------------+ 4| Length | +---------------+---------------+---------------+---------------+ 8| Reserved (0) | + + @@ -2553,11 +2662,17 @@ 36| Response | Reserved (0) | +---------------+---------------+---------------+---------------+ 40/ Reserved (0) / +/ / +---------------+---------------+---------------+---------------+ - 48 + 48| SRA 1 | + | | + +---------------+---------------+---------------+---------------+ + 56| SRA 2 (if specified) | + | | + +---------------+---------------+---------------+---------------+ + + ... 2.16.1 Entries Mapped The total number of mapped entries. @@ -2576,17 +2691,17 @@ iSCSI November, 2000 0 Function Complete - 1 Map Function Rejected - Bad Descriptors + 1 Map Function Rejected - bad descriptors 2 Map Function Rejected - too many descriptors - 3 Unmap Function Rejected - Bad Descriptor + 3 Unmap Function Rejected - bad descriptor If the Response to a map is function complete the data following the - header contains the SRAs to be used in third party commands; each SRA - matches a descriptor in the Map command. + header contains the 8-byte SRAs to be used in third party commands; each + SRA matches the corresponding descriptor in the Map command. Note that a map command can only entirely succeed (and then all descriptors are mapped or unmapped) or entirely fail. @@ -2638,16 +2753,17 @@ The logout command is used by an initiator to "clean-up" the target end of a failing connection and enable recovery to start. On sessions with a single connection, this might imply opening a second connection with the sole purpose of cleaning-up the first. +! 
Will this also close the TCP stream? Byte / 0 | 1 | 2 | 3 | / | | | | |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| +---------------+---------------+---------------+---------------+ - 0|F| 0x08 |1|0|Reserved (0) | + 0|1| 0x08 |1|0|Reserved (0) | +---------------+---------------+---------------+---------------+ 4| Length | +---------------+---------------+---------------+---------------+ 8| CID | Reserved (0) | +---------------+---------------+---------------+---------------+ @@ -2698,11 +2814,12 @@ Byte / 0 | 1 | 2 | 3 | / | | | | |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| +---------------+---------------+---------------+---------------+ - 0|1| 0x48 |1|0| Reserved (0) | + 0|0| 0x48 |1|0| Reserved (0) | +! Was the one above deliberate? +---------------+---------------+---------------+---------------+ 4| Length | +---------------+---------------+---------------+---------------+ 8/ Reserved (0) / / / @@ -2770,11 +2887,11 @@ Byte / 0 | 1 | 2 | 3 | / | | | | |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| +---------------+---------------+---------------+---------------+ - 0|F| 0x50 |1|0| Reserved (0) | + 0|0| 0x50 |1|0| Reserved (0) | +---------------+---------------+---------------+---------------+ 4| Length | +---------------+---------------+---------------+---------------+ 8| Reserved (0) | + + @@ -2861,11 +2978,11 @@ Byte / 0 | 1 | 2 | 3 | / | | | | |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| +---------------+---------------+---------------+---------------+ - 0|F| 0x51 |1|0| Reserved (0) | + 0|0| 0x51 |1|0| Reserved (0) | +---------------+---------------+---------------+---------------+ 4| Length | +---------------+---------------+---------------+---------------+ 8| Logical Unit Number (LUN) | + + @@ -2888,11 +3005,11 @@ 48/ Sense Data / +/ / +---------------+---------------+---------------+---------------+ -2.20.1 iSCSI Event +2.20.1 iSCSI Event Indicator Some Asynchronous Events are 
strictly related to iSCSI while others are related to SAM-2. The codes returned for iSCSI Asynchronous Events are: @@ -3022,11 +3139,11 @@ Byte / 0 | 1 | 2 | 3 | / | | | | |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| +---------------+---------------+---------------+---------------+ - 0|F| 0x7f) |1|0| Reserved (0) | + 0|0| 0x7f) |1|0| Reserved (0) | +---------------+---------------+---------------+---------------+ 4| Length | +---------------+---------------+---------------+---------------+ 8/ Reserved (0) / +/ / @@ -3108,10 +3225,14 @@ command. This indicates the start of the authentication sequence. The command includes the protocol version supported by the target and the security parameters (not iSCSI parameters, those will be returned only after security is established to protect them) supported by the target. +! I dislike the idea of a Text Response to a Login request. I prefer the +! old method of "more authentication required" from an architectural point +! of view. (This method is effectively using Text Response as a Text +! Request, which is just wrong. B-) 3.2 Security negotiation The negotiation proceeds as follows: @@ -3131,11 +3252,11 @@ to the least. -The target MUST reply with the first option in the list it supports. The parameters are encoded in Unicode - UTF8 as key:value (e.g., the encryption option of triple-DES will appear as encryption:3des-cbc). The initiator MAY send - proprietary options as well. The ônoneö option MUST be included + proprietary options as well. The "none" option MUST be included in the list, indicating no algorithm supported by the target. If security is to be established, the initiator MUST NOT send parameters other than security parameters in the login command. The general parameters should be negotiated only after security is established at the desired level. 
Any operational @@ -3240,15 +3361,16 @@ For any outstanding SCSI command, it is assumed that iSCSI in conjunction with SCSI at the initiator is able to keep enough information to be able to rebuild the command PDU, that outgoing data is available (in host memory) for retransmission while the command is - outstanding. It is also assumed that at a target iSCSI and + outstanding. It is also assumed that, at a target, iSCSI and specialized TCP implementations are able to recover unacknowledged - data packets from a closing connection or, alternatively the target - has means to re-read data from a device server. It is further - assumed that a target will keep the "status & sense" for a command it + data packets from a closing connection or, alternatively, the target + has the means to re-read data from a device server. It is further + assumed that a target will keep the Status and Sense for a command it +! Bitwise AND would have been fun though. B-) has executed while the total number of outstanding commands and executed commands does not exceed its limit. A target will sequentially number the delivered responses and thus enable initiators to tell when a response is missing and which response is missing. @@ -3260,11 +3382,11 @@ for some devices and communications networks recovery may be complex and may percolate to upper software layers. It is assumed that targets and/or initiators will recognize a failing connection by either transport level means (TCP) or by a gap in the command or response stream that is not filled for a long time, or by a failing - iSCSI ping (the later should be used periodically by highly reliable + iSCSI ping/NOP (the latter should be used periodically by highly reliable implementations). Initiators and targets SHOULD use the keep-alive option on the TCP connection to enable early link failure detection on idle links.
The iSCSI recovery involves the following steps: @@ -3272,11 +3394,11 @@ -abort offending TCP connection(s) (target & initiator) and recover at target all unacknowledged read-data -issue a Logout command on a remaining connection or create a new connection and issue the Logout command -wait for the Logout response - -if needed create one or more new TCP connections (within the + -if needed, create one or more new TCP connections (within the same session) and associate all outstanding commands from the failed connection to the new connection at both initiator and target. @@ -3292,12 +3414,12 @@ are not acknowledged yet or a new CmdRN if they were acknowledged; the retry (X) flag in the command PDU will be set -upon receiving the new/retry commands the target will resume command execution; for write commands it means requesting data retransmission through R2T, for reads retransmitting recovered - data and for "terminated" commands retransmitting the status & - sense while retaining the original StatRN. If data recovery is + data and for "terminated" commands retransmitting the Status and + Sense while retaining the original StatRN. If data recovery is not possible, the target will either provide data from the media or redo the operation (if the operation is not idempotent the device server may fail the operation). @@ -3306,16 +3428,16 @@ The authors recognize that mapping framed messages over a "stream" connection (like TCP) makes the proposed mechanisms vulnerable to simple software framing errors and introducing framing mechanisms may be onerous for performance and bandwidth. Command reference numbers and the above mechanisms for connection drop and reestablishment will - help handle this type of mapping errors. + help detect and handle these types of mapping errors. 
4.3 Session Errors If all the connections of a session fail and can't be reestablished - in a short time or if initiators detect protocol errors repeatedly an + in a short time or if initiators detect protocol errors repeatedly, an initiator may choose to terminate a session and establish a new session. It will terminate all outstanding requests with an iSCSI error indication before initiating a new session. A target that detects one of the above errors will take the following actions: @@ -3349,11 +3471,11 @@ mechanism was designed to enable RDMA at the iSCSI level or lower. 5.1 Small TCP Segments It is recommended that TCP segments be limited in size to no more - than 8K bytes. One reason we recommend small segments is to allow a + than 8192 bytes. One reason we recommend small segments is to allow a stronger type of checksum, possibly utilizing CRC, which is practical only for smaller segments. 5.2 Multiple Network Adapters @@ -3398,12 +3520,12 @@ 6. Security Considerations 6.1 Data Integrity - We assume that basic level end-to-end data integrity can be assured - by TCP, by using the standard checksum. For those applications for + We assume that basic level end-to-end data integrity can be reasonably + handled by TCP, by using the standard checksum. For those applications for which data integrity is of utmost importance iSCSI will provide an integrity option. 6.2 Network operations and the Threat Model @@ -3420,30 +3542,30 @@ Attacks fall into three main areas; passive, active, and denial of service. 6.2.1.1 Passive Attacks - In general, data transfers will be made through a switched fabric, + Often, data transfers will be made through a switched fabric, making sniffing difficult. In addition, the nature of the data (block transfers), even if sniffed, would not necessarily be readily understandable to the attacker. 
That being said, a determined attacker, by capturing of content and analyzing traffic over time, - could replicate enough of a drive to make the captured data + could replicate enough of a storage device to make the captured data meaningful. Certain storage operations which are mostly unidirectional, such as writing to a tape or reading from a CD-ROM, - are even more susceptible to passive attacks since the listener will + are more susceptible to passive attacks since the listener will be able to replicate most if not all of the operation. Passive attacks by traffic analysis alone is deemed out of scope since it is unlikely that the listener will be able to guess any pertinent information without knowing the content of the messages. It is also out of scope to detect passive attacks. The protocol must be able to prevent passive attacks by masking the contents of messages through some form of encryption. Finally, it is assumed that a strong authentication mechanism will be - necessary. Therefore, any long-lived passwords or private keys must + necessary. Therefore, any long-lived passwords or private keys should never be sent in the clear. Satran Standards-Track, May 2001 64 iSCSI November, 2000 @@ -3454,11 +3576,11 @@ Whereas passive attacks involve SNIFFING, active attacks will generally involve SPOOFING. If an attacker can successfully masquerade as a client, he will have total read/write access to those storage resources assigned to that client. Spoofing as a server is - more difficult, since many operations involve client reads of some + sometimes more difficult, since many operations involve client reads of some expected or otherwise understandable data. Most likely, many of the sessions will be long-lived. This feature has a dual effect of making these sessions more vulnerable to attack (hijacking TCP connections, cryptographic attacks), while at the same @@ -3475,11 +3597,11 @@ the effect of reducing performance, and as such can act as a denial of service. 
    It is possible that an attacker can modify a message in such a way
    the session becomes uncoordinated, resulting in a tear down of the
    session.

-6.2.2 Security Model
+6.2.2 iSCSI Security Models

 6.2.2.1 No Security

    This mode does not authenticate nor does it encrypt data. This mode
    should only be used in environments where there is minimal security
@@ -3492,10 +3614,12 @@
    errors. Once the client is authenticated, all messages are sent and
    received in the clear. This mode should only be used when there is
    minimal risk to man-in-the-middle attacks, eavesdropping, message
    insertion, deletion, and modification. For example, this mode can be
    used when IPsec is used in security gateways.
+
+! Digest?

 6.2.2.3 iSCSI integrity and authentication

 Satran                  Standards-Track, May 2001                     65
@@ -3517,10 +3641,11 @@
    This mode provides for the end-to-end encryption (e.g. IPsec). In
    addition to authenticating the client, it provides end-to-end data
    integrity and protects against man-in-the-middle attacks,
    eavesdropping, message insertion, deletion, and modification.
+! And SSL/TLS?

    A connection or multiple connections can be protected end-to-end by
    using IPSec. In this case, the initiator must use the "Implicit
    Authentication" parameter to indicate that IPSec should be used to
    specify the Access ID and perform authentication.
@@ -3546,11 +3671,11 @@
    encrypted passwords and trusted certificate authorities. Once the
    initiator and target are confident of the identity of the attached
    party, the established channel is considered secure.

 6.4 Feasibility
-
+! Bad page split.

 Satran                  Standards-Track, May 2001                     66

                             iSCSI                       November, 2000
@@ -3612,11 +3737,11 @@

 7. IANA Considerations

    There will be a well-known port for iSCSI connections. This well
-   known port is registered with IANA.
+   known port will be registered with IANA.

@@ -3721,11 +3846,11 @@
                             iSCSI                       November, 2000

    [TLS] The TLS Protocol, RFC 2246, T. Dierks et al.
-
+! RFC for IPv6 dotted notation.

@@ -3792,11 +3917,11 @@
    Daniel F. Smith
    IBM Almaden Research Center
-   650 Harry Road
+   650 Harry Road K65/C2
    San Jose, CA 95120-6099, USA
    Phone: +1 408 927 2072
    Email: dfsmith@almaden.ibm.com

@@ -3991,11 +4116,11 @@
    The following table details authentication methods.

    +-----------------------------------------------------------+
    | Name      | Description                                   |
-
+! Bad page split.

 Satran                  Standards-Track, May 2001                     74

                             iSCSI                       November, 2000
@@ -4004,10 +4129,12 @@
    +-----------------------------------------------------------+
    | password  | Plain text user-password                      |
    +-----------------------------------------------------------+
    | challenge | Challenge and response                        |
    +-----------------------------------------------------------+
+   | kerberos5 | Defined by Kerberos version 5 protocol        |
+   +-----------------------------------------------------------+
    | none      | No authentication                             |
    +-----------------------------------------------------------+

    The following table details public key algorithms for authentication.
@@ -4032,10 +4159,11 @@
    For example, if ssh-dss is selected:

       public_key:ssh-dss,p,q,g,y

    Here the "p", "q", "g", and "y" parameters (encoded as numbers in
+! In hexadecimal? In decimal? In MIME? In base 64+'A'?
    Unicode UTF8) form the signature key blob. Signing and verifying
    using this key format are done according to the Digital Signature
    Standard [FIPS-186] using the SHA-1 hash. A description can also be
    found in [Schneier].
@@ -4121,13 +4249,13 @@
       init_auth:password
       I-> Text authenticate:alef,sesam

    If the authentication is successful:

       T->StartSecure:HERE ...
-      T-> Login ôlogin acceptö
+      T-> Login "login accept"

    If the authentication was not successful:

-      T-> Login ôlogin rejectö
+      T-> Login "login reject"

    Note - the Text command including SecureStart:HERE and each PDU after
    it will have the trailer consisting in a hmac-md5 digest for the
    header and a crc32 for each 2k of data (or fraction thereof).
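
The trailer described in the note just above (an hmac-md5 digest over the header, plus a crc32 for each 2k of data or fraction thereof) could be computed roughly as below. This is only a sketch under stated assumptions: the draft text quoted here does not give the trailer layout, so the ordering (digest first, then the CRCs), the big-endian 4-byte CRC encoding, the use of the zlib CRC-32 polynomial, and the name build_trailer are all hypothetical choices made for illustration.

```python
import hashlib
import hmac
import zlib

CHUNK_SIZE = 2048  # "each 2k of data (or fraction thereof)"


def build_trailer(key: bytes, header: bytes, data: bytes) -> bytes:
    """Return an illustrative trailer: HMAC-MD5(header) + one CRC-32 per 2k chunk.

    Assumed, not taken from the draft: field order, big-endian CRC encoding,
    and the zlib CRC-32 polynomial.
    """
    # 16-byte keyed digest over the PDU header only.
    header_digest = hmac.new(key, header, hashlib.md5).digest()

    # One 4-byte CRC for every full or partial 2048-byte chunk of data.
    crcs = b"".join(
        zlib.crc32(data[offset:offset + CHUNK_SIZE]).to_bytes(4, "big")
        for offset in range(0, len(data), CHUNK_SIZE)
    )
    return header_digest + crcs
```

With this layout, 5000 bytes of data fall into three chunks (2048 + 2048 + 904), so the trailer is 16 + 3 * 4 = 28 bytes long; a PDU with no data carries only the 16-byte digest.
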
@@ -4157,12 +4285,12 @@
                             iSCSI                       November, 2000

    If the user was not confirmed, the target sends a login
-   response message with ôlogin rejectö to the initiator. Else,
-   it can send a login response with ôlogin acceptö and MAY
+   response message with "login reject" to the initiator. Else,
+   it can send a login response with "login accept" and MAY
    attach a secret:

       T->Text StartSecure:HERE secret:
       I->Text ... parameters ...EndLogin:HERE
       T->Login (accept) ... parameters ...
@@ -4184,11 +4312,11 @@

    Note: the last packet should have the appropriate trailers.

    If the initiator was not confirmed, the target sends a login response
-   message with ôlogin rejectö to the initiator. Else, it can continue
+   message with "login reject" to the initiator. Else, it can continue
    with the login process:

       T-> Text authenticate:user,blob salt:532678925

    In here, the target authenticates itself to the initiator. If the
@@ -4234,15 +4362,15 @@
       T-> Text challenge:question2
       I-> Text authenticate:answer2

    And at the end:

-      T-> Login ôlogin acceptö
+      T-> Login "login accept"

    If the authentication was not successful:

-      T-> Login ôlogin rejectö
+      T-> Login "login reject"

    Note - the Text command after authentication and each PDU thereafter
    will have in the trailer an hmac-md5 digest for the header and a
    crc32 for each 2k of data (or fraction of it).

@@ -4384,18 +4512,19 @@
    immediate data length requested, etc..

 08 MaxConnections

    MaxConnections:<number-from-1-to-65442>
+! Okay... how was this number picked?

    Initiator and target negotiate the maximum number of connections
    requested/acceptable.

 09 Target

-   Target:<domainname>[/modifier]
+   Target:<domainname>[:<port>][/modifier]

    Examples:

       Target:disk-array.sj-bldg-h.cisco.com
       Target:disk-array.sj-bldg-h.cisco.com/control7
@@ -4442,10 +4571,11 @@
    is required, unless both the initiator and the target send this key-
    pair attribute specifying UseR2T:no. Once UseR2T has been set to
    'no', it cannot be set back to 'yes'.
    Note than only the first outgoing data item (either immediate data or
    a separate PDU) can be sent unsolicited by a R2T.
+! Could we use "1" and "0" here instead of "yes" and "no"?

 12 BidiUseR2T

    BidiUseR2T:<yes|no>

@@ -4486,27 +4616,27 @@

 14 ImmediateDataLength

    ImmediateDataLength:<number>

    Initiator and target negotiate the maximum length supported for
-   immediate data. Default is 4GB.
+   immediate data. Default is 2**32-1 bytes.

 15 ITagLength

-   ITagLength:<number-from8-to-32>
+   ITagLength:<number-from-8-to-32>

    Initiator and target negotiate the significant length of the
    initiator tag to be used. Default is 32.

 16 PingMaxReplyLength

    PingMaxReplyLength:<number>

    Initiator and target negotiate the maximum length of data contained
-   in a ping reply. Default is 4096.
+   in a ping reply. Default is 4096 bytes.

 17 StartSecure

    StartSecure:HERE

 --End--
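
To illustrate the Target key syntax proposed in the diff above, Target:<domainname>[:<port>][/modifier], here is a minimal parsing sketch. It assumes plain domain names only (an IPv6 literal containing colons would need a different notation, which the draft does not yet define), a decimal port in the range 1 to 65535, and a modifier with no whitespace. The names TargetKey and parse_target, and the port number 5003 used in the usage note, are hypothetical and do not come from the draft.

```python
import re
from typing import NamedTuple, Optional


class TargetKey(NamedTuple):
    domainname: str
    port: Optional[int]      # None when no ":<port>" part is given
    modifier: Optional[str]  # None when no "/modifier" part is given


# Target:<domainname>[:<port>][/modifier]
# Assumption: the domain name contains no ':' or '/' characters.
_TARGET_RE = re.compile(
    r"Target:(?P<domain>[^:/\s]+)(?::(?P<port>\d+))?(?:/(?P<modifier>\S+))?"
)


def parse_target(text: str) -> TargetKey:
    """Split a Target key-value string into its optional parts."""
    match = _TARGET_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid Target key: {text!r}")

    port = int(match["port"]) if match["port"] is not None else None
    if port is not None and not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")

    return TargetKey(match["domain"], port, match["modifier"])
```

Under these assumptions, the draft's second example, Target:disk-array.sj-bldg-h.cisco.com/control7, yields the domain name with no port and the modifier control7, while Target:disk-array.sj-bldg-h.cisco.com:5003/control7 additionally yields port 5003.
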