tidy bug

Tim ignored_mailbox at yahoo.com.au
Mon Jun 4 11:19:28 UTC 2007


On Mon, 2007-06-04 at 01:07 -0600, Frank Cox wrote:
> I have found a bug in tidy-0.99.0-12.20070228.fc7.
> 
> However, tidy and libtidy are not listed as a component on
> bugzilla.redhat.com, even though the tidy packager is listed as "Fedora Project
> <http://bugzilla.redhat.com/bugzilla>" in the rpm.
> 
> So, what's my next step?
> 
> (The bug turns a 328-line html file into a 32110-line monster, consisting
> mostly of font directives.  The same file turns into 1433 lines using tidy on
> Fedora Core 6.)

I'm not too surprised that you get a monster file out of it: you're
feeding it broken HTML in the first place.  HTML tidy mainly "tidies" HTML
code (reformats it in a neat way); I wouldn't rely on it as a
fix-up-broken-HTML tool.  Tidy could, possibly, be made to cope with that
particular error, but the real fault is elsewhere.

Copying a simplified example line from your source:
<P><FONT><FONT><U>stuff</U> <U>stuff</U></FONT></P>

Notice there are two opening font tags, but only one closing one.  There's
a huge number of lines like that.  To attempt to fix it, tidy has either
got to put in an extra closing font tag, or merge together the two
opening ones.  But that's really not a job that HTML tidy should have to
sort out; whatever generated the HTML in the first place needs fixing.
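
If you want to see how widespread the problem is before chasing the
generator, a rough check is easy to script.  The following is only a
sketch using Python's standard html.parser module; the names
TagBalanceCounter and unbalanced_tags are made up for illustration and
have nothing to do with tidy itself.  It just counts opening and closing
tags per element name and reports any that don't match:

```python
from collections import Counter
from html.parser import HTMLParser


class TagBalanceCounter(HTMLParser):
    """Count opening and closing tags by name (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.opened = Counter()
        self.closed = Counter()

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names, so <FONT> is counted as "font".
        self.opened[tag] += 1

    def handle_endtag(self, tag):
        self.closed[tag] += 1


def unbalanced_tags(html):
    """Return {tag: opens - closes} for every tag whose counts differ."""
    counter = TagBalanceCounter()
    counter.feed(html)
    counter.close()
    tags = set(counter.opened) | set(counter.closed)
    return {
        tag: counter.opened[tag] - counter.closed[tag]
        for tag in tags
        if counter.opened[tag] != counter.closed[tag]
    }


# The simplified example line from the source file.
line = "<P><FONT><FONT><U>stuff</U> <U>stuff</U></FONT></P>"
print(unbalanced_tags(line))  # {'font': 1}
```

Bear in mind that elements which never take a closing tag (a bare <br>
or <img>, say) will also show up as "unbalanced" in a count like this, so
treat the result as a hint about where to look, not as a validator.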

You'd be better off making one CSS rule applied to paragraphs on the
page, anyway, rather than a gazillion font elements.  For example:

<style type="text/css">
p {margin-bottom: 0;
   font-family: "Times New Roman", serif;
   font-size: 12pt;}
</style>

<p>advert text</p>
<p>advert text</p>
<p>advert text</p>

-- 
(This box runs FC6, my others run FC4 & FC5, in case that's
 important to the thread.)

Don't send private replies to my address, the mailbox is ignored.
I read messages from the public lists.




