International Chars cause backup problems - FYI

Anne Wilson cannewilson at tiscali.co.uk
Mon Mar 13 22:45:43 UTC 2006


On Monday 13 March 2006 22:36, Joel Rees wrote:
> On 2006.3.14, at 02:24 AM, Anne Wilson wrote:
> > On Monday 13 March 2006 15:01, Anne Wilson wrote:
> >> I have a number of files on my server that have international
> >> characters in their names.  My backups are constantly failing, with
> >> errors like this:
> >>
> >> Incorrectly encoded string (Soy Loco por Ti, América) encountered.
> >> Possibly creating an invalid Joliet extension. Aborting.
> >>
> >> I could continue trying to find every instance of these characters,
> >> but it's very inefficient.  It seems to me that I have only ever used
> >> the character sets utf8, 8859-1 and 8859-15, yet setting kde to use
> >> any of them results in titles with spaces instead of the
> >> international characters.  I presume that only file labels are
> >> causing problems - there will be strings in text documents as well.
> >>
> >> I've installed convmv, which is supposed to deal with converting
> >> character sets, but it seems to me that the biggest problem is
> >> knowing which files need conversion.  I don't particularly want it to
> >> go through the whole drive converting everything unnecessarily.
> >>
> >> Any ideas how I can ascertain which character sets were used in
> >> naming these files, and how to list all files using that encoding?
> >> Or any other way of tackling the problem?
> >
> > I didn't find any way to list the affected files, so I used convmv
> > against the folders most likely to hold affected files.  Backup
> > completed and verified, so it had nothing to do with k3b or hardware
> > problems - just character sets.  FWIW, I told it to convert to utf8.
>
> Other than not using Joliet (the most correct answer IIUC), that's about
> what you should do.  But, just to check, you do make the conversion
> _before_ the file gets copied, right?

That's right.  convmv can be run recursively on a directory, and any
files whose names are not already valid utf8 get renamed.  After that,
backup is easy.
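
For anyone who wants to see which files are affected before renaming
anything, here is a rough sketch in Python.  It is not part of convmv and
was not used for the backup above; the function names are made up for
illustration.  It walks a tree using raw byte filenames and reports every
name that does not decode as utf8.  It cannot tell 8859-1 from 8859-15,
since both are single-byte encodings in which almost any byte sequence is
valid, and pure-ASCII names pass because ASCII is already valid utf8.

```python
import os
from typing import List


def is_valid_utf8(name: bytes) -> bool:
    """Return True if the raw filename bytes decode cleanly as UTF-8."""
    try:
        name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def find_non_utf8_names(root: str) -> List[bytes]:
    """Walk `root` and collect paths whose last component is not valid UTF-8.

    Passing a bytes path to os.walk makes it yield bytes, so the names
    are checked exactly as stored on disk, with no decoding by Python.
    """
    bad = []
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root)):
        for name in dirnames + filenames:
            if not is_valid_utf8(name):
                bad.append(os.path.join(dirpath, name))
    return bad
```

Calling find_non_utf8_names("/some/dir") returns the suspect paths as
bytes, which can be printed with repr() so the offending bytes are
visible.  That list could then be used to decide which folders to point
convmv at, rather than running it over the whole drive.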

Anne