[publican-list] sortable lists, esp. glossaries

Mon Jan 30 06:28:58 UTC 2012

Hi Fred, welcome to the list.

We have discussed this many times; see:
https://fedorahosted.org/publican/wiki/WishList#GlossaryCollation

Collating the three scripts of Japanese (hiragana, katakana and kanji) 
properly is the Mt Everest of this challenge.

On 01/27/2012 08:24 AM, Fred Dalrymple wrote:
> Hi everyone --
>
> Just joined this list, though I've been at Red Hat for a couple of years. I'm a marketing writer in Westford, currently writing the CloudForms Evaluation Guide, and have started using topic tools and publican.
>
> I understand that we've generally avoided alphabetically sorted lists, e.g., glossaries, in documents because the sort order may change when translated.

It's not because the order changes; it's because it's not currently 
possible to collate kana correctly.

> I'm thinking of prototyping a solution that would transform a list into an appropriate alternate order and am interested in any requirements that people might have. For example:
>
>      • I'm assuming it should work with DocBook 4.5 (at least, that's what I'm using now) -- any other versions or tagsets?

Really it should just take two strings and report which one collates 
after the other (the "bigger than" result). Leave it up to the app to 
supply the strings and decide what it wants to pass in.
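
As a rough sketch of that two-string interface (not a real kana 
collator): the Python below folds katakana onto hiragana by code point 
and compares the results. The names fold_kana and collate_cmp are made 
up for illustration, and it ignores kanji, voiced marks, small kana and 
the long-vowel mark, which is where the real difficulty lies.

```python
from functools import cmp_to_key

# Assumption: only the basic katakana block U+30A1..U+30F6 is folded.
KATAKANA_START = 0x30A1
KATAKANA_END = 0x30F6
KANA_OFFSET = 0x60  # distance between a katakana and its hiragana twin


def fold_kana(text):
    """Map each katakana character to the matching hiragana character."""
    return "".join(
        chr(ord(ch) - KANA_OFFSET)
        if KATAKANA_START <= ord(ch) <= KATAKANA_END
        else ch
        for ch in text
    )


def collate_cmp(a, b):
    """Return -1, 0 or 1 depending on how a collates relative to b."""
    ka, kb = fold_kana(a), fold_kana(b)
    if ka != kb:
        return -1 if ka < kb else 1
    # Same after folding: fall back to the raw strings so that the
    # ordering stays deterministic (hiragana before katakana).
    if a == b:
        return 0
    return -1 if a < b else 1


# The calling app decides which strings to pass in, e.g. glossary terms.
terms = ["キ", "か", "あ", "ア"]
ordered = sorted(terms, key=cmp_to_key(collate_cmp))
```

A plain code point sort would put all hiragana ahead of all katakana; 
with the fold, カ sorts before き regardless of script. Anything beyond 
that would need a proper collation library behind the same interface.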

>      • what lists, other than glossaries, should be handled by this solution? An obvious possibility is <orderedlist>.

Ignore that level to start with, just handle two strings.

>
>      • is there anything wrong with the general idea of deriving an alternate version of a source file, to reorder designated lists, as long as the original source files are not harmed? I'm leaning toward doing the processing just before formatting so that it would be more independent of edits.

It limits how it can be used. XSL can call arbitrary functions so it can 
be handled during the XSLT call, no need for an extra intermediate file.

>
>      • are there any constraints on which language the pre-processor is written in? For example, perl is ok?

There is no constraint. Perl would be great, but kana collation would be 
extremely useful to many open source projects, so picking a language 
that performs well and can easily be made into a library would be good.

> How are translations handled? Files are sent out, translated, and returned with markup generally intact (i.e., only content translated)? I see the directory structure in the document tree for each locale.

I wouldn't worry about that. Collating kana is an enormous challenge and 
you are better off leaving the app-specific integration up to app 
developers. I'd have it working in publican in a split second if it 
existed as a library.

> A requirement that I'm planning to observe: this approach should not require changes to any content source files. However, new, independent, non-content files could be used to drive the process. The only exception that I can see now is, for example, an <orderedlist> might indicate that it is supposed to be alphabetized by using something like role="alphabetized".

Again, forget application use, concentrate on getting the collation working.

Good luck!

-- 
"Reply All" why you shouldn't use it: 
http://www.emailreplies.com/#12replytoall