Using sed to straighten quotes.

Janina Sajka janina at rednote.net
Mon Apr 17 16:40:09 UTC 2017


I've had good success with pandoc, to the extent that I now have it
auto-converting mail attachments via my mailcap.

Janina

Tim Chase writes:
> While it *can* be done in sed, the solution requires visiting every
> character and manually noting a transliteration, as well as doing
> those changes for every input encoding that you might encounter.
> 
> You should have the `iconv` package on your system which will let you
> specify the input/output encodings as well as force those characters
> to be transliterated:
> 
>  iconv -f utf8 -t ascii//TRANSLIT input.txt > output.txt
> 
> which converts from (-f) UTF-8 encoding to transliterated ASCII.  If
> your input is some other Windows code-page or other source, you can
> change the "from" encoding to something other than "utf8" such as
> "cp1252".
> 
> -tim
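
A minimal sketch of the iconv approach above, wrapped in a shell
function so the "from" encoding can be passed as an argument. The
function name to_ascii is hypothetical, and the exact replacement
chosen for each character depends on the iconv implementation and the
current locale, so treat the output as something to check rather than
a guarantee.

```shell
#!/bin/sh
# to_ascii [ENCODING]   (hypothetical helper name)
# Reads text on stdin in ENCODING (default UTF-8) and writes ASCII,
# asking iconv to transliterate characters that have no ASCII form.
to_ascii() {
    iconv -f "${1:-UTF-8}" -t ASCII//TRANSLIT
}

# Example: a file saved on Windows in code page 1252.
# to_ascii CP1252 < input.txt > output.txt
```

Running "iconv -l" lists the encoding names your system accepts for
the -f and -t options.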
> 
> 
> On April 17, 2017, Jeffery Mewtamer wrote:
> > Things like left and right double curly quotes and single right
> > curly quotes are the most common offenders, and they also screw up
> > my screen reader's pronunciation of contractions and possessives,
> > though things like ellipses, em-dashes, and accented letters also
> > cause problems.
> > 
> > Most of these problems can be fixed manually, though it means I
> > often spend as much time correcting the file as I do reading it.
> > 
> > I know how to use sed to do global search and replace on plain text
> > files, at least where both the string to be found and the string
> > it's to be replaced with can be typed, but most of the replacements
> > I'd like to make have search strings containing characters not on my
> > keyboard.
> > 
> > So, how do I tell sed to replace a left double curly quote with a
> > straight double quote, an ellipsis with three periods, or an e with
> > an acute accent with a normal e among other such things? And if
> > this is beyond sed's capabilities, could someone suggest another
> > command line tool that can automate this task?
> > 
> > -- 
> > Sincerely,
> > 
> > Jeffery Wright
> > Bachelor of Computer Science
> > President Emeritus, Nu Nu Chapter, Phi Theta Kappa.
> > Former Secretary, Student Government Association, College of the
> > Albemarle.
> > 
> > _______________________________________________
> > Blinux-list mailing list
> > Blinux-list at redhat.com
> > https://www.redhat.com/mailman/listinfo/blinux-list
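
On the original question of how to give sed characters that are not on
the keyboard: one possible approach, sketched here under the
assumption that the input is UTF-8, is to build each character from
its bytes with printf octal escapes and hand the result to sed. The
function name straighten and the choice of "--" for an em-dash are
illustrative assumptions, and only the handful of characters named in
the question are covered.

```shell
#!/bin/sh
# Build each non-ASCII character from its UTF-8 bytes (octal escapes),
# so nothing outside plain ASCII has to be typed.
ldq=$(printf '\342\200\234')   # U+201C left double curly quote
rdq=$(printf '\342\200\235')   # U+201D right double curly quote
lsq=$(printf '\342\200\230')   # U+2018 left single curly quote
rsq=$(printf '\342\200\231')   # U+2019 right single curly quote
ell=$(printf '\342\200\246')   # U+2026 ellipsis
emd=$(printf '\342\200\224')   # U+2014 em-dash
eac=$(printf '\303\251')       # U+00E9 e with acute accent

# straighten (hypothetical name): filter stdin to stdout.
# LC_ALL=C makes sed match the byte sequences literally.
straighten() {
    LC_ALL=C sed \
        -e "s/$ldq/\"/g" -e "s/$rdq/\"/g" \
        -e "s/$lsq/'/g"  -e "s/$rsq/'/g" \
        -e "s/$ell/.../g" -e "s/$emd/--/g" \
        -e "s/$eac/e/g"
}

# Usage: straighten < input.txt > output.txt
```

This is the per-character bookkeeping described above, so it only
scales to a short list of offenders; every further accented letter
needs its own -e expression.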

-- 

Janina Sajka,	Phone:	+1.443.300.2200
			sip:janina at asterisk.rednote.net
		Email:	janina at rednote.net

Linux Foundation Fellow
Executive Chair, Accessibility Workgroup:	http://a11y.org

The World Wide Web Consortium (W3C), Web Accessibility Initiative (WAI)
Chair, Accessible Platform Architectures	http://www.w3.org/wai/apa
