[publican-list] 1.99: translation(?) to SA drops </ulink> end tag

Jeff Fearn jfearn at redhat.com
Tue Jun 1 23:07:11 UTC 2010


On Tue, 2010-06-01 at 02:33 -0400, Robert P. J. Day wrote:
> On Tue, 1 Jun 2010, Jeff Fearn wrote:
> 
> > On Mon, 2010-05-31 at 06:08 -0400, Robert P. J. Day wrote:
> > > ok, i'm not sure what the deal is here.  here's the original XML
> > > file datadir/Common_Content/common/en-US/Conventions.xml:
> > >
> > > ... In PDF and paper editions, this manual uses typefaces drawn from
> > > the <ulink url="https://fedorahosted.org/liberation-fonts/">Liberation
> > > Fonts</ulink> set. ...
> > >
> > >   and here's the cause of a ./Build error in file
> > > datadir/Common_Content/common/tmp/ar-SA/xml_tmp/Conventions.xml:
> > >
> > > ... In PDF and paper editions, this manual uses typefaces drawn from
> > > the <ulink url="https://fedorahosted.org/liberation-fonts/">Liberation
> > > Fonts set. ...
> > >
> > >   is there a reason that </ulink> end tag has disappeared?
> >
> > There are a couple of patches required to upstream modules:
> > https://fedorahosted.org/publican/#Patchesforrequiredpackages
> >
> > This sounds like the HTML::TreeBuilder bug.
> 
>   hmmmmm ... speaking from a position of massive ignorance, i'm not
> quite convinced.  the diagnostic specifically identifies the error
> with:
> 
>   at /usr/lib/perl5/XML/Parser.pm line 187
> 
> so wouldn't that represent XML parsing, not HTML?

This is due to inheritance: XML::Element is a subclass of
HTML::Element. The input is processed via XML::Element, which picks up
as_XML from HTML::Element, which has the bug of dropping end tags
on optionally empty elements. This output is then fed to XML::Parser,
which barfs due to the missing close tag.
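
To make that chain concrete, here is a rough toy model in Python. It is
not the actual Perl code: the class names, the OPTIONALLY_EMPTY set and
the simulated bug are all assumptions made up for illustration. It only
shows how a serializer inherited from an HTML-oriented base class can
emit output that a strict XML parser then rejects.

```python
import xml.parsers.expat


class HTMLElement:
    """Toy stand-in for an HTML-oriented element class (hypothetical)."""

    # Assumption: tags the serializer treats as "optionally empty".
    OPTIONALLY_EMPTY = {"ulink"}

    def __init__(self, tag, children=()):
        self.tag = tag
        self.children = list(children)

    def as_xml(self):
        body = "".join(
            c if isinstance(c, str) else c.as_xml() for c in self.children
        )
        if self.tag in self.OPTIONALLY_EMPTY:
            # Simulated bug: the end tag is dropped even though the
            # element has content.
            return f"<{self.tag}>{body}"
        return f"<{self.tag}>{body}</{self.tag}>"


class XMLElement(HTMLElement):
    """Defines no as_xml of its own, so it inherits the buggy one."""


def is_well_formed(text):
    """Return True if a strict XML parser accepts the text."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(text, True)
        return True
    except xml.parsers.expat.ExpatError:
        return False


para = XMLElement(
    "para",
    ["typefaces drawn from the ", XMLElement("ulink", ["Liberation Fonts"]), " set."],
)
output = para.as_xml()
print(output)
print(is_well_formed(output))
```

The error surfaces in the XML parser, but the broken markup was produced
one step earlier by the inherited serializer, which is why the
diagnostic points at the parser rather than at the module with the bug.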

>   but there's
> something else i just noticed.
> 
>   unlike with the XML treebuilder package, there doesn't appear to be
> a corresponding ubuntu HTML package -- here's the entire list of
> "treebuilder" packages that exist:

That is a typo on the wiki; it should be HTML::Tree. I have fixed it.

> $ apt-cache search treebuilder
> libhtml-scrubber-perl - Perl extension for scrubbing/sanitizing html
> libhtml-treebuilder-xpath-perl - Perl module to add XPath support to HTML::TreeBuilder
> libwww-mechanize-treebuilder-perl - Perl module integrating WWW::Mechanize and HTML::TreeBuilder
> libxml-handler-trees-perl - Perl module for building tree structures using PerlSAX handlers
> libxml-treebuilder-perl - XML parser providing XML::Elements DOM similar to HTML::Element
> $
> 
>   you can see that one for xml, i have that installed.  but there is
> no directly corresponding one for HTML.  however, if you notice that
> third package in the list, it clearly claims to integrate
> HTML::TreeBuilder functionality, and i *don't* have that installed --
> nothing in the build process suggested i needed it.
> 
>   maybe this is all wildly irrelevant, i'm just pointing out what i'm
> seeing -- no actual, specific libhtml-treebuilder-perl package on
> ubuntu.
> 

Yeah, my bad :(

Cheers, Jeff.

-- 
Jeff Fearn <jfearn at redhat.com>
Software Engineer
Engineering Operations
Red Hat, Inc
Freedom ... courage ... Commitment ... ACCOUNTABILITY

Sure our competitors can rebuild the source but can they engage the customer the same way? -wmealing



