Re: Email character set conversion in procmail

Todd Zullinger wrote:

Paul Howarth wrote:
Does anyone have a procmail recipe for email character set
conversion? I help to run an email newsletter, where people send in
contributions that get edited together into a single message that is
then sent out to all the list members. It would be nice if all of
the incoming messages were in the same character set (e.g. utf8,
iso-8859-1, whatever) from the point of view of pasting them into a
single document, but I haven't been able to find a recipe for doing
this automatically. Any suggestions?

Something piping the mail through iconv or recode is what I'd think
you want.  Here's one I found via google and tweaked a little:

# convert utf to latin (if the subject is translate me)
:0
* ^Subject: translate me$
* ^Content-Type: text/(plain|html); .*charset=.?utf-8
{
    # convert the body (assumes it is not base64 or quoted-printable encoded)
    :0 fbw
    |iconv -f UTF-8 -t ISO-8859-1//TRANSLIT

    # then rewrite the Content-Type header to match the new charset
    :0 fhw
    * ^Content-Type: text/plain
    |formail -c -i "Content-Type: text/plain; charset=ISO-8859-1"

    :0 Efhw
    * ^Content-Type: text/html
    |formail -c -i "Content-Type: text/html; charset=ISO-8859-1"
}

Maybe that'll get you started on a good solution.  Or maybe it will
inspire someone that knows much better than I to post a better
solution. :)

It's probably better to go in the other direction, from latin to utf,
as there are bound to be characters in utf that can't get converted to
latin.  But I'm not a charset guru so it's all guesswork for me.
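
To illustrate why the latin-to-utf direction is the safer one, here is a
small Python sketch (not part of the recipe above, just a demonstration).
It assumes Python's built-in "iso-8859-1" and "utf-8" codecs; the sample
string is made up for the example, and the "replace" error handler is only
a rough stand-in for what iconv's //TRANSLIT does.

```python
# Latin-1 -> UTF-8 cannot fail: every Latin-1 byte maps to a Unicode
# code point, and every code point can be written as UTF-8.
latin1_bytes = bytes(range(256))
as_utf8 = latin1_bytes.decode("iso-8859-1").encode("utf-8")

# UTF-8 -> Latin-1 can fail: the euro sign (U+20AC) has no Latin-1
# equivalent, so a strict conversion raises an error.
text = "price: 10 \u20ac"  # hypothetical sample text
try:
    text.encode("iso-8859-1")
    lossless = True
except UnicodeEncodeError:
    lossless = False

# A lossy fallback: Python's "replace" handler substitutes "?" for
# anything that cannot be represented in the target charset.
fallback = text.encode("iso-8859-1", errors="replace")
```

So converting everything to UTF-8 keeps the text intact, while converting
to Latin-1 either fails or silently alters some characters.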

I'll have a play with this and see how far I get. However, it's not
going to help with multipart/alternative mail, so I'd need something
MIME-aware for that. Has anyone got any suggestions?
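
One possible MIME-aware approach, sketched here with Python's standard
email package rather than anything proposed in this thread: parse the
message, walk its parts, and re-encode each text part as UTF-8. The
function name to_utf8 is made up for this example. The sketch assumes
every text part declares a charset Python knows about, and it resets the
Content-* headers of each part it rewrites.

```python
from email import message_from_bytes, policy


def to_utf8(raw: bytes) -> bytes:
    """Return the message with every non-attachment text/* part in UTF-8."""
    msg = message_from_bytes(raw, policy=policy.default)
    for part in msg.walk():
        # Leave multipart containers, binary parts and attachments alone.
        if part.get_content_maintype() != "text" or part.is_attachment():
            continue
        # get_content() undoes the transfer encoding (base64 or
        # quoted-printable) and decodes using the declared charset.
        text = part.get_content()
        # set_content() clears the old Content-* headers and writes the
        # text back with a charset=utf-8 Content-Type.
        part.set_content(
            text,
            subtype=part.get_content_subtype(),
            charset="utf-8",
        )
    return msg.as_bytes()
```

A small wrapper script that reads the message on stdin and writes the
result of to_utf8 to stdout could then be used as a procmail filter in
place of the iconv and formail steps, since it handles both the bodies and
the Content-Type headers of each part.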

