
What does Red Hat say on Perl/UTF-8 problems?


Please try the short script attached, which comes straight from the Perl
Cookbook FTP page at www.ora.com. It gives a tree view of the output of
the du command. On a standard xterm in RH8 it also prints a bunch of
warnings like these:

Malformed UTF-8 character (unexpected end of string) at
./SOFT_TMP/cookbook.examples/ch05/dutree line 17, <> line 773.
Malformed UTF-8 character (unexpected end of string) at
./SOFT_TMP/cookbook.examples/ch05/dutree line 19, <> line 773.

The errors disappear after running "export LANG=en_US", much like the
man page issue. The script mb2md, which converts mailboxes to Maildir
format, shows an almost identical problem (download it at
 http://batleth.sapienti-sat.org/projects/mb2md/ )

After these two simple test cases, the actual question: 

Red Hat did a Very Good Thing moving to Unicode/UTF-8 defaults in
Psyche. Somebody had to start the mass migration, and I am glad that
they did. In spite of this, it cannot be denied that {old, 3rd party}
Perl scripts working on {old, randomly encoded} text files break
when run in the default Perl/shell environment of Psyche.

Is there a Red Hat page specifying:

	what must be changed in scripts so that one does NOT need to
	alter environment variables before and after running Perl
	programs, and maybe

	what kind of shell wrappers one must use when there is no
	possible solution of the kind above
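
To make the second item concrete, here is a minimal sketch of the kind
of wrapper meant, assuming that the only thing the script needs is a
non-UTF-8 locale such as the en_US value that silenced the warnings
above. The names run_legacy_locale and LEGACY_LANG are invented for
this illustration and are not part of any Red Hat or Perl tool.

```shell
#!/bin/sh
# Hypothetical helper: run one command under a non-UTF-8 locale
# without touching the caller's own environment.
run_legacy_locale() {
    (
        # LC_ALL and LC_CTYPE take precedence over LANG, so clear
        # them inside the subshell only.
        unset LC_ALL LC_CTYPE
        # LEGACY_LANG is an assumed override knob; the default is
        # the en_US value from the report above.
        LANG="${LEGACY_LANG:-en_US}"
        export LANG
        # Replace the subshell with the real command (must be an
        # external program, not a shell function).
        exec "$@"
    )
}
```

It would be called as "run_legacy_locale perl ./dutree", and the
calling shell keeps its UTF-8 settings afterwards because every change
happens inside the subshell.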

If there is no such page, why not?

Keep in mind that I'm perfectly aware that this is not only Red Hat's
responsibility. I am posting almost the same message to the Perl/UTF-8
list. What I'm asking Red Hat for is something that clearly says:

	"with our default environment and Perl packages, do this, this
	and this to make your scripts work again"

	"This and that specific behaviors are bugs in Perl, and we
	have to wait for them to be fixed"

	"This and that specific behaviors mean that the *script* is
	hopelessly broken, and should be rewritten (ditto for specific
	perl modules)"

Any feedback is welcome!

		Marco Fioretti
Marco Fioretti                 m.fioretti, at the server inwind.it
Red Hat for low memory         http://www.rule-project.org/en/

The three most dangerous things are a programmer with a soldering
iron, a manager who codes, and a user who gets ideas.
