Re: Internationalisation issue

Mark,

I think that Glen's proposal makes sense. Names in the SCSI space are not manually manipulated too often, and mistakes are bound to be corrected fast.

You would not believe this from a person using a language that has no case :-)

Julo

Mark Bakke <mbakke@cisco.com> on 26-05-2001 00:17:31

Please respond to Mark Bakke <mbakke@cisco.com>

To: Glen Turner <glen.turner@aarnet.edu.au>
cc: ips@ece.cmu.edu
Subject: Re: Internationalisation issue

Glen-

The IESG wouldn't let us specify an iSCSI name as a URN, so we are just going to let this draft expire. Anyway, the case-insensitive rule still applies, and is specified in the naming & discovery document.

Basically, an iSCSI name is treated as an opaque string; the only valid operation on an iSCSI name (other than display) is comparison. We had to specify a method to compare iSCSI names. Since they are text strings, and since text strings are transcribable, it is less problematic for users if they are case-insensitive.

However, you are correct that once we have gone from ASCII to UTF-8, comparison is not as simple. On my Linux system, it looks like I have some wide character toupper() and tolower() routines which would probably do the trick after converting the UTF-8 into 32-bit Unicode, but I haven't looked at the code that goes with them or its complexity.

Our possible solutions are:

- Make iSCSI names case-sensitive

  This is easy, but not as nice for the user, especially when manual configuration is performed.

- Make iSCSI names case-insensitive for all character sets.

  This is probably the ideal from a user's point of view. I would be interested in comments as to how much this is needed (it's what is currently specified) or even if it is needed, who plans to support it. This is the hardest one to implement. Some character sets have three cases (lower, upper and title) as well.
  Jim H had written up some stuff on how this works in the current Naming & Discovery document. I guess the question here is whether it's worth going to the trouble to give the user the ultimate solution. The comparison method would be:

  1. If the high bit is not set on any of the characters in one of the iSCSI names to be compared, just do a strcasecmp().
  2. Otherwise, convert the iSCSI names from UTF-8 to 32-bit Unicode.
  3. Use wcscasecmp() to compare the names.

  The above assumes the GNU/Linux library calls. Again, I haven't looked at how much code this is, but it's freely available and would work.

- Make iSCSI names case-insensitive within the ASCII character range, and case-sensitive for everything else.

  This means that non-ASCII characters have to be made case-sensitive, which is fine for many users, but may not be as nice for those using other character sets. It would be easy to implement, since normal toupper() and tolower() functions can be used.

- Make iSCSI names case-sensitive, but mandate that they are all transmitted and stored in lower case.

  This makes it the user interface's problem to do the tolower() on whatever character sets it supports. It's easy for the iSCSI implementation. However, an iSCSI name displayed on "the other side" of a connection would lose any case typed by the user. This might be the best compromise.

- Make iSCSI names ASCII

  This eliminates many character sets, and will restrict methods of producing unique names in various languages. It would also not handle the multilingual DNS names when they happen. However, since we do have an Alias that is a UTF-8 string and doesn't have to be compared, perhaps nobody plans to generate an iSCSI name based on non-ASCII characters?

Any thoughts? I would be interested in the following opinions from the list:

1. Do you or your customers see value in using non-ASCII character sets to generate iSCSI names, or is the use of Alias enough?

2. Do you or your customers see value in preserving the original case in which an iSCSI name was generated, or would it be fine just to specify that they are always transferred in lower case?

3. Is the procedure for comparing these names too much to expect from a small target?

--
Mark

Glen Turner wrote:
>
> > Rules for Lexical Equivalence:
> >
> > The entire URN is case-insensitive.
>
> Why? This implies that devices now need to carry casing
> tables for multi-lingual support.
>
> If the field is case sensitive then adding multilingual support
> is simply a matter of specifying UTF-8 and no ongoing work is
> needed to track multilingual DNS.
>
> Regards,
> Glen

--
Mark A. Bakke
Cisco Systems
mbakke@cisco.com
763.398.1054
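
To make the third option in Mark's list concrete (case-insensitive within the ASCII range, case-sensitive for everything else), together with the high-bit test from step 1 of his comparison method, here is a minimal C sketch. It is only an illustration: the function names is_pure_ascii(), fold_ascii() and iscsi_name_cmp_ascii_fold() are hypothetical and do not come from any iSCSI draft, and the sketch assumes the names arrive as NUL-terminated, well-formed UTF-8 strings. It folds bytes by hand rather than calling tolower(), so the result does not depend on the current locale.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Step 1 of the comparison method: true if no byte of s has the
 * high bit set, i.e. the name is plain 7-bit ASCII. */
bool is_pure_ascii(const char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if ((unsigned char)*s & 0x80)
            return false;
    }
    return true;
}

/* Fold only 'A'..'Z' to lower case. Bytes with the high bit set
 * (parts of multi-byte UTF-8 sequences) are returned unchanged. */
unsigned char fold_ascii(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c - 'A' + 'a') : c;
}

/* Hypothetical helper: compare two names, ignoring case for ASCII
 * letters and comparing every other byte exactly.
 * Returns 0 if the names match, nonzero otherwise. */
int iscsi_name_cmp_ascii_fold(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    for (;; a++, b++) {
        unsigned char ca = fold_ascii((unsigned char)*a);
        unsigned char cb = fold_ascii((unsigned char)*b);
        if (ca != cb)
            return (int)ca - (int)cb;
        if (ca == '\0')
            return 0;   /* both strings ended together */
    }
}
```

With made-up names, iscsi_name_cmp_ascii_fold("Disk-Array-01", "disk-array-01") returns 0, while two names that differ only in the case of a non-ASCII letter (for example the UTF-8 encodings of a lower-case and an upper-case e-acute) compare as different. That is exactly the trade-off described for this option: simple for a small target, but not as nice for users of other character sets.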