tr problem

Cameron Simpson cs at zip.com.au
Sat Jun 14 09:24:48 UTC 2008


On 13Jun2008 22:42, Gene Heskett <gene.heskett at verizon.net> wrote:
| Before this gets too far off the track, I wanted to replace the single $0D with 
| a single $0A.  And yes, I'm aware that dos used both characters, which I hate 
| to admit is the actual right way to do it.

You're right to hate it, because it is the wrong way to do it.
Using two characters as a line delimiter is mad, because it introduces
parsing ambiguity (what should a bare \r or \n mean? What about \n\r?
etc).
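
To make the ambiguity concrete, here is a minimal Python sketch. The
sample bytes and the count_lines helper are made up for illustration;
nothing here comes from the original thread. It counts the "lines" in one
mixed-delimiter buffer under four different conventions:

```python
# Hypothetical sample: one buffer containing CR+LF, a bare CR, and LF+CR.
data = b"one\r\ntwo\rthree\n\rfour"


def count_lines(data: bytes, delimiter: bytes) -> int:
    """Count the pieces produced by splitting on one fixed delimiter."""
    return len(data.split(delimiter))


lf_only = count_lines(data, b"\n")       # UNIX-style reader
cr_only = count_lines(data, b"\r")       # CR-only reader
crlf_only = count_lines(data, b"\r\n")   # strict CR+LF reader
universal = len(data.splitlines())       # treats \r, \n and \r\n as breaks

print(lf_only, cr_only, crlf_only, universal)
```

Each reader reports a different count for the same bytes, and the
"universal" reader sees the \n\r pair as two breaks with an empty line
between them.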

The two character delimiter is a holdover from naively coded systems that
wanted to dump text files direct to printers without any translation, so a
carriage return and a line feed were needed to manipulate the printer head.
Which is just daft in a data storage format.

That the IETF internet text protocols use CR+NL as line delimiters is
a compatibility thing, not a recommendation.

A line _should_ be terminated by a single character. What that character
is is a somewhat arbitrary choice, given that the ASCII table doesn't
have an end-of-line (EOL) character, just CR and LF, and ASCII was what
was there to play with. UNIX went with NL, OS/9 and Macs went with CR,
and DOS went with "I'm too dumb to translate text delimiters into
printer control actions", thus its CR/NL overspeak.
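
For the original goal of turning each single $0D into a single $0A, a
rough Python sketch follows. The function names are my own and purely
illustrative. The first function does a plain byte-for-byte swap, much
as tr '\r' '\n' would; the second assumes you may also meet CR+LF pairs
and collapses those first so they don't become double newlines:

```python
def cr_to_lf(data: bytes) -> bytes:
    """Replace every CR (0x0D) with LF (0x0A), one byte for one byte."""
    return data.replace(b"\r", b"\n")


def normalize_to_lf(data: bytes) -> bytes:
    """Collapse CR+LF pairs to LF first, then convert any remaining bare CR."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# A CR-only file converts cleanly with the simple swap.
print(cr_to_lf(b"a\rb\r"))
# The simple swap turns each CR+LF pair into two LFs (an extra blank line).
print(cr_to_lf(b"a\r\nb"))
# The normalizing variant copes with a mix of conventions.
print(normalize_to_lf(b"a\r\nb\rc\n"))
```

If the input really is CR-only, the simple swap is all that is needed.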

Cheers,
-- 
Cameron Simpson <cs at zip.com.au> DoD#743
http://www.cskk.ezoshosting.com/cs/

Microsoft: Where do you want to go today?
UNIX:      Been there, done that!



