Size optimization of a HTML document

Welty, Richard richard.welty at bankofamerica.com
Tue Nov 29 16:53:21 UTC 2005


Paul Smith wrote:
>Thanks to all. HTML Tidy seems to be the right tool. However, the HTML
>code of my document is not valid (602 errors are found). Is there some
>automatic way of getting all those errors repaired?

that's really hard because any such tool would have to guess about what's
wrong.

however, i suggest that it's not as bad as it seems; these sorts of errors
tend to snowball. start at the top, fix a couple and revalidate. occasionally
you'll see something that can be fixed with a global replace, a simple sed
script, or the like.
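
as a rough illustration of the "global replace" idea, here is a minimal python sketch for one common case: a bare "&" in text or a url that should have been written as "&amp;". the function name and the choice of error are my assumptions, not something taken from the document in question. it also makes no attempt to skip script blocks or comments, so check the result before trusting it.

```python
import re

# Matches an '&' that does NOT already start an entity such as
# &amp; (named), &#169; (decimal) or &#xA9; (hexadecimal).
BARE_AMP = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)

def escape_bare_ampersands(html: str) -> str:
    """Replace every bare '&' with '&amp;', leaving existing entities alone."""
    return BARE_AMP.sub("&amp;", html)

if __name__ == "__main__":
    before = '<a href="x?a=1&b=2">Q&amp;A</a>'
    print(escape_bare_ampersands(before))
```

run it over a copy of the file, revalidate, and see how far the error count drops before moving on to the next kind of problem.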

"602 errors" can easily be caused by an order of magnitude fewer actual
problems.
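
one way to see this for yourself is to group the validator's messages by their text and ignore the line numbers. the sketch below assumes a made-up "line N: text" message format and made-up sample messages; real validators word and lay out their output differently, so the parsing would need adjusting.

```python
from collections import Counter

def summarize(messages):
    """Count how often each distinct message text occurs, ignoring line numbers."""
    counts = Counter()
    for msg in messages:
        # Hypothetical format "line N: text"; keep only the text part.
        _, _, text = msg.partition(": ")
        counts[text] += 1
    # Most frequent message first.
    return counts.most_common()

if __name__ == "__main__":
    # Invented messages, for illustration only.
    sample = [
        "line 12: unescaped & in attribute value",
        "line 40: unescaped & in attribute value",
        "line 41: element font not allowed here",
        "line 97: unescaped & in attribute value",
    ]
    for text, count in summarize(sample):
        print(count, text)
```

the message at the top of that list is usually the best candidate for a global replace.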

also, what are you validating against? i'd suggest that if you're doing this
sort of cleanup, you should look at html 4.01 or xhtml 1.0 as your
target. writing strict xhtml 1.0 that works in older browsers takes a
little care, but it's not outrageously difficult.

richard



