[publican-list] Machine-readable meta-data in docbook

Jeffrey Fearn jfearn at redhat.com
Wed Aug 25 02:53:38 UTC 2010


Joshua Wulf wrote:
>  On 08/25/2010 11:46 AM, Jeffrey Fearn wrote:
>> Joshua Wulf wrote:
>>>   Is there a good way to include machine readable meta-data in a 
>>> docbook file for use with publican?
>>>
>>> I'm thinking things like Author, Created Date, Modified Date, 
>>
>> These examples exist in revhistory, which can be contained in the 
>> various *info tags and at some other block levels, e.g. chapterinfo, 
>> sectioninfo, appendixinfo, para, etc.
>>
>> I'm pretty sure they don't get rendered if they are in these tags, but 
>> if they are we could switch that off easily enough.
>>
>> You could have a single revision to track current status if you didn't 
>> want the entire history.
>>
>>> Validated, QE Flag, etc...
>>
>> IMHO these are examples of information best kept in a work flow system 
>> not in the XML. When I d/l and modify an XML it's no longer Validated 
>> and the QE flag is incorrect. It's trivial to get out of sync and have 
>> an incorrect perception of where you are at in the whole 
>> write/review/publish process.
>>
> I agree. If we did this longer term we'd migrate the meta-data out to a 
> dedicated container. Just looking for something to get a prototype up 
> and running at this point.
> 
>> Even if you are going to keep it in the XML, since it's not being 
>> displayed and only being accessed for machine processing you could 
>> easily use existing attributes to cover this.
>>
>> e.g. you could add Verified or QE'd into the revision remark.
>>
>> e.g. you could set the condition attribute in the revision to 
>> condition="Verified".
> 
> 
>>
>>> Information that is useful to have, but should not be displayed in 
>>> the document output.
>>
>>
>>> I looked at including using RDFa to do it [1], but it looks like that 
>>> relies on Docbook 5.
>>
>> AIUI RDFa won't work in DocBook 4: it injects attributes from foreign 
>> namespaces into existing tags, which requires support in the DTD to 
>> be able to run validation.
>>
>> Also that article ends with "So I'm not sure." so you'd need to 
>> confirm that it will actually be in DocBook 5.
>>
>>> Any suggestions for Docbook 4?
>>
>> Try the revhistory, it should cover your needs.
>>
> The root elements of the docbook files I need to annotate are 
> <variablelist>, <procedure>, <table> and possibly some others at a 
> similar level. Anything down there that might be usable?

If it's just to check automated access to metadata at semi-random 
places, why not use a remark that contains a key=value formatted string 
and a role to identify its use? Remark has a pretty wide range of places 
it can appear.

e.g. <remark role="metadata">author="Dave Dude" verified=false 
created=20100127</remark>

This way in debug mode you'd get to see where these things are set and 
what their values are.
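
A rough sketch of reading such a remark back out, assuming Python and 
only its standard library. The function name and the sample document 
are made up for illustration, and it assumes the file parses without 
unresolved entities; shlex is used only because it already copes with 
quoted values.

```python
import shlex
import xml.etree.ElementTree as ET

# Hypothetical sample; any element that allows <remark> would do.
SAMPLE = """<procedure>
  <remark role="metadata">author="Dave Dude" verified=false
created=20100127</remark>
  <step><para>Do the thing.</para></step>
</procedure>"""


def read_metadata(xml_text):
    """Collect key=value pairs from every <remark role="metadata">."""
    root = ET.fromstring(xml_text)
    meta = {}
    # iter() also visits the root element itself.
    for remark in root.iter("remark"):
        if remark.get("role") != "metadata":
            continue
        # shlex.split honours the quotes, so "Dave Dude" stays one value.
        for token in shlex.split(remark.text or ""):
            key, sep, value = token.partition("=")
            if sep:
                meta[key] = value
    return meta


print(read_metadata(SAMPLE))
```

Every value comes back as a string, so something like verified=false 
still has to be interpreted by whatever consumes it.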

If you need even wider usage, an alternative would be to use a similar 
key=value system, but in a common attribute that isn't currently used; 
security, userlevel and vendor stand out. These can be used in any tag, 
but you lose the ability to easily see the values in the output and you 
are more constrained in what content you can put in an attribute.

e.g. <table vendor="author='Dave Dude' verified=false">

It's not as tidy as using tags, but string manipulation is not a great 
barrier and it works around a few harder problems that you probably 
don't need to solve atm. If you are using machine generation to set and 
get the content you can just use string serialization to pack and unpack 
the content.
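
The pack and unpack step could look something like this in Python. It 
is a sketch only: pack() and unpack() are hypothetical helper names, 
keys are assumed to contain no whitespace or "=", and the vendor 
attribute is just the one picked in the example above.

```python
import shlex
import xml.etree.ElementTree as ET


def pack(meta):
    """Serialize a dict into one key=value string, quoting where needed."""
    return " ".join(
        f"{key}={shlex.quote(str(value))}" for key, value in meta.items()
    )


def unpack(packed):
    """Reverse pack(); every value comes back as a string."""
    pairs = (token.partition("=") for token in shlex.split(packed))
    return {key: value for key, sep, value in pairs if sep}


# Hypothetical table; in practice this would come from the real file.
table = ET.fromstring("<table><title>Example</title></table>")
table.set("vendor", pack({"author": "Dave Dude", "verified": "false"}))

# The serializer takes care of escaping the attribute value.
print(ET.tostring(table, encoding="unicode"))
print(unpack(table.get("vendor")))
```

Letting the XML library write the attribute means the packed string 
only has to survive the key=value round trip, not XML escaping as well.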

Cheers, Jeff.

-- 
Jeff Fearn <jfearn at redhat.com>
Software Engineer
Engineering Operations
Red Hat, Inc
Freedom ... courage ... Commitment ... ACCOUNTABILITY



