
Re: Back from locale hell ...

Matt Wilson wrote:
> http://anubis.dkuug.dk/JTC1/SC22/WG20/ is a good place to start.
> http://anubis.dkuug.dk/JTC1/SC22/WG20/docs/14652fcd.txt has a bit of
> commandline rationale -- I'm not sure if it is applicable.
> 14651 is for sorting.  It's a good read...

I couldn't find any specification in 14651 or 14652 of the expected
behavior when LC_COLLATE is not set in the environment. Did I miss it?
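
For reference, here is a minimal sketch of how the effective collation
locale is usually resolved from the environment. It follows the POSIX
precedence (LC_ALL, then LC_COLLATE, then LANG). The function name
effective_collate_locale is made up for illustration, and the final
fallback to "C" is an assumption: POSIX leaves the default
implementation-defined, although "C" is what common libcs use.

```python
import os

def effective_collate_locale(env=None):
    """Return the locale name that would govern collation.

    Hypothetical helper, not part of any library. It only inspects
    environment variables; it does not call setlocale().
    """
    if env is None:
        env = os.environ
    # POSIX precedence: LC_ALL overrides LC_COLLATE, which overrides LANG.
    for var in ("LC_ALL", "LC_COLLATE", "LANG"):
        value = env.get(var)
        if value:  # an empty value is treated like an unset one
            return value
    # Assumption: the implementation-defined default is the "C" locale.
    return "C"
```

So with none of the three variables set, this sketch reports "C", which
would mean plain byte-value ordering rather than any dictionary-style
rules.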

I did find this in 14652:

	The following is a rephrasing of rules defined for "lexical ordering in
	English and French" by the Canadian Standards Association
	(text in brackets is rephrased):

	(1)     Once special characters (punctuation) have been removed
	        from original strings, the ordering is determined by
	        scanning forward (left to right) [disregarding case and
	(2)   .... 

And later:

	punct   Define characters to be classified as punctuation
	        characters. No character specified for the keywords
	        upper, lower, alpha, digit, cntrl, xdigit, or as the
	        <space> character shall be specified. The keyword shall
	        be specified.

Notice that <space> is not considered punctuation and should therefore *not*
be removed before sorting. So perhaps the current default behaviour
is actually a real algorithmic bug as well?
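
To make the difference concrete, here is a rough sketch comparing the
two readings. It is only an approximation: ASCII string.punctuation
stands in for the locale's punct class, case is folded with casefold(),
and keys are compared by code point rather than by real collation
weights. The names key_keep_space and key_drop_space are hypothetical.

```python
import string

def key_keep_space(s):
    # Reading suggested by the punct definition: strip punctuation only,
    # so <space> stays in the key and takes part in the comparison.
    return "".join(ch for ch in s if ch not in string.punctuation).casefold()

def key_drop_space(s):
    # The other reading: <space> is stripped along with punctuation.
    return "".join(
        ch for ch in s if ch not in string.punctuation and ch != " "
    ).casefold()

words = ["ab d", "abc", "a-b e"]

kept = sorted(words, key=key_keep_space)     # ['ab d', 'a-b e', 'abc']
dropped = sorted(words, key=key_drop_space)  # ['abc', 'ab d', 'a-b e']
```

The two keys order the same three strings differently, so whether
<space> is ignored is visible in the output and not just a matter of
interpretation.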

John Ellson (ellson lucent com)  Lucent Technologies, Holmdel, NJ, 07733
