rpmlint file-not-utf8

Nicolas Mailhot nicolas.mailhot at laposte.net
Fri Sep 14 08:20:12 UTC 2007


On Fri, 14 September 2007 00:26, David Woodhouse wrote:
> On Thu, 2007-09-13 at 18:35 +0100, José Matos wrote:
>>   Not only that, but I remember seeing HTML pages composed in latin1
>> and without the charset in the metadata. So the warning has its uses.
>> :-)
>
> Well... doesn't HTTP default to ISO8859-1 unless the charset is
> otherwise specified?


HTTP yes, but HTML no.

-> see http://www.w3.org/TR/html4/charset.html

« The HTTP protocol ([RFC2616], section 3.7.1) mentions ISO-8859-1 as
a default character encoding when the "charset" parameter is absent
from the "Content-Type" header field. In practice, this recommendation
has proved useless because some servers don't allow a "charset"
parameter to be sent, and others may not be configured to send the
parameter. Therefore, user agents must not assume any default value
for the "charset" parameter. »

Also:

1. A lot of pages are not ISO-8859-1 but ISO-8859-15 or the Windows
Latin variant (windows-1252), so *never* assume that a page is valid
ISO-8859-1 just because it has no charset declaration.

2. The default encoding is user-settable at the browser level, and
users do change the US-friendly ISO-8859-1 default, so any page without
a charset declaration will render wrongly on some systems.

3. Local HTML pages are read directly from disk, without passing
through HTTP, so HTTP defaults do not apply.
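
To make point 1 concrete, here is a minimal Python sketch showing that
the same bytes give different text under each of the three encodings.
The sample bytes and the decode_all name are made up for illustration;
only the standard codecs are used.

```python
# The same bytes decode to different text depending on the assumed charset.
# SAMPLE is an invented byte string: 0xA4 and 0x80 are two of the positions
# where these Latin encodings disagree.
SAMPLE = b"price: 10 \xa4 or 10 \x80"

CANDIDATES = ("iso-8859-1", "iso-8859-15", "cp1252")


def decode_all(data):
    """Return the text obtained under each candidate single-byte encoding."""
    return {name: data.decode(name) for name in CANDIDATES}


results = decode_all(SAMPLE)
for name, text in results.items():
    # repr() makes invisible control characters such as U+0080 visible
    print(f"{name:12} {text!r}")
```

Byte 0xA4 is the currency sign in ISO-8859-1 and windows-1252 but the
euro sign in ISO-8859-15, while byte 0x80 is the euro sign only in
windows-1252. A reader guessing the wrong one gets the wrong symbol
without any error being raised.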

So any HTML page without a charset declaration should be treated as a
bug (unless it's part of a webapp whose Apache configuration forces a
particular encoding, or it's an XHTML page with the encoding specified
at the XML level).
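
As a rough illustration of what such a check could look like, here is
a Python sketch that looks for a charset declaration in the first few
kilobytes of a file. This is not what rpmlint does; the
declared_charset name, the regular expressions and the 4096-byte
window are all assumptions of mine, and a real check would want a
proper HTML parser.

```python
import re

# Hypothetical patterns: a <meta> tag carrying charset=..., or an XML
# declaration with an encoding attribute (the XHTML case).
META_CHARSET = re.compile(
    rb"<meta[^>]*\bcharset\s*=\s*[\"']?\s*([\w.:-]+)", re.IGNORECASE
)
XML_ENCODING = re.compile(
    rb"<\?xml[^>]*\bencoding\s*=\s*[\"']([\w.:-]+)[\"']", re.IGNORECASE
)


def declared_charset(data):
    """Return the charset name declared in the document head, or None."""
    head = data[:4096]  # arbitrary window; declarations sit near the top
    match = XML_ENCODING.search(head) or META_CHARSET.search(head)
    return match.group(1).decode("ascii") if match else None


page = b'<html><head><meta http-equiv="Content-Type" ' \
       b'content="text/html; charset=ISO-8859-15"></head></html>'
print(declared_charset(page))
print(declared_charset(b"<html><head><title>x</title></head></html>"))
```

A file for which this returns None would be the candidate for a
warning. It cannot see the Apache-forced case, since that lives in the
server configuration and not in the file.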

-- 
Nicolas Mailhot




