hexdump doing funny things with utf-8 bom?

Joel rees at ddcom.co.jp
Wed Mar 9 08:09:29 UTC 2005


I have a UTF-8 file with a BOM. bash tells me it can't execute the thing
because it's binary. That's okay, I'll get rid of the BOM.
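
A minimal sketch of one way to drop the BOM, assuming the file starts with
the three UTF-8 BOM octets EF BB BF. The function name strip_utf8_bom and
the sample bytes are made up for illustration and do not come from any tool
mentioned here.

```python
import codecs

def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

# Hypothetical script contents, used only as sample input.
sample = b"\xef\xbb\xbf#!/bin/sh\necho hello\n"
cleaned = strip_utf8_bom(sample)
```

Reading the file in binary mode, passing the bytes through this function,
and writing the result back would leave the #! line as the first bytes of
the file.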

But while I was checking for the BOM with hexdump, I discovered that
hexdump clips off the lead octet (0xef) in its default and hexadecimal (-x)
output. The canonical (-C) and character (-c) displays show it correctly.

Mixing hexadecimal with character display (hexdump -cx) produces really odd
output: you can clearly see the hexadecimal is off by one octet, and the
hexadecimal dump shows a final garbage character to keep the length even.
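
One possible reading of this output, offered as an assumption rather than a
diagnosis: the default and -x formats print 16-bit words in host byte order,
so on a little-endian machine each pair of octets appears swapped, and an
odd-length input is padded with a zero octet to fill the last word. The
sketch below imitates the two views for comparison. The helper names and
the sample bytes are hypothetical, and it does not reproduce hexdump's
exact layout.

```python
def byte_view(data: bytes) -> str:
    """One octet at a time, in file order (like the hex column of -C)."""
    return " ".join(f"{b:02x}" for b in data)

def word_view_le(data: bytes) -> str:
    """Two octets at a time, read as little-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # assumed padding so the last word is complete
    words = [int.from_bytes(data[i:i + 2], "little")
             for i in range(0, len(data), 2)]
    return " ".join(f"{w:04x}" for w in words)

sample = b"\xef\xbb\xbf#!"   # UTF-8 BOM followed by "#!" (hypothetical input)
print(byte_view(sample))     # ef bb bf 23 21
print(word_view_le(sample))  # bbef 23bf 0021
```

In the word view the ef is still present, but it sits in the low half of the
first word, which can look like a missing or shifted lead octet when it is
lined up against the character row.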

I'm going to file a bug report tonight (if I remember), but I was wondering
if anyone else has seen this.

--
Joel Rees   <rees at ddcom.co.jp>
digitcom, inc.   株式会社デジコム
Kobe, Japan   +81-78-672-8800
** <http://www.ddcom.co.jp> **




More information about the fedora-list mailing list