Convert unwrapped paragraphs to hard-wrapped paragraphs when there are no blank lines.

Linux for blind general discussion blinux-list at redhat.com
Sat Mar 28 07:39:33 UTC 2020


Hi Paul,

On Fri, 27 Mar 2020 14:43:01 -0700
Linux for blind general discussion <blinux-list at redhat.com> wrote:

> > I don't understand how paragraphs start and end in these files. Otherwise
> > you can try using one of the text processing tools mentioned here:
> >
> > * https://www.shlomifish.org/open-source/resources/text-processing-tools/
> >
> > * https://www.computerhope.com/unix/ufold.htm
> >
> > * https://en.wikipedia.org/wiki/Fmt_(Unix)
> >
> > * https://en.wikipedia.org/wiki/Par_(command)
> >
> > Note that you may have better luck converting EPUBs (assuming they lack
> > https://en.wikipedia.org/wiki/Digital_rights_management ) to plaintext using
> > tools such as https://pandoc.org/ ,
> > https://metacpan.org/search?q=html%3A%3Awikiconverter&size=20 , etc.  
> 
> Of that list of programs, I'd be inclined to use Pandoc. It permits
> you to write filters in (embedded) Lua, which is a quick-to-learn
> programming language. For example, this Lua one-liner converts a
> string ("s") to add a line break after each existing line break:
> 
> s = string.gsub(s, "<BR>", "<BR>\n<BR>")
> 

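Other tools may work as well. Note, however, that your substitution only
matches the literal upper-case "<BR>". Lua patterns are case-sensitive, so it
will miss "<br>", "<br />" and "<br/>", and it will do nothing for documents
that mark paragraphs with the more recommended
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/p element instead of
line breaks.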
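
To illustrate the point about "<br>" variants, here is a minimal sketch in
Python (not Lua, and not a Pandoc filter) of a substitution that tolerates the
common spellings. The names BR_TAG and double_line_breaks are made up for this
example, and the pattern deliberately ignores "<br>" tags that carry
attributes, so treat it as a rough illustration rather than a robust solution:

```python
import re

# Matches <br>, <BR>, <br/>, <br /> and similar spellings.
# Assumption: the tag carries no attributes (e.g. <br class="x"> is not matched).
BR_TAG = re.compile(r"<br\s*/?>", re.IGNORECASE)


def double_line_breaks(html: str) -> str:
    """Follow every matched <br> variant with a newline and a second copy of it."""
    return BR_TAG.sub(lambda m: m.group(0) + "\n" + m.group(0), html)


if __name__ == "__main__":
    print(double_line_breaks("first line<br>second line<BR />third line"))
```

Even this version leaves documents that use "<p>" elements untouched, which is
why a regular expression is only a stopgap here.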

Also see:

* https://perl-begin.org/uses/text-parsing/

* https://blog.codinghorror.com/parsing-html-the-cthulhu-way/
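
In the spirit of the second link, a real HTML parser is usually safer than
regular expressions. The following is only a sketch, using Python's standard
html.parser and textwrap modules: it starts a new paragraph at every "<p>" or
"<br>" (in any spelling or case), then hard-wraps each paragraph and separates
them with blank lines. The class and function names, and the default width of
72 columns, are assumptions made for this example. It ignores every other kind
of markup (lists, headings, "<pre>", scripts and so on):

```python
from html.parser import HTMLParser
import textwrap


class ParagraphExtractor(HTMLParser):
    """Collect text, starting a new paragraph at each <p> or <br>."""

    def __init__(self):
        super().__init__()
        self.paragraphs = [[]]  # list of paragraphs, each a list of text pieces

    def _break(self):
        # Only open a new paragraph if the current one has collected something.
        if self.paragraphs[-1]:
            self.paragraphs.append([])

    def handle_starttag(self, tag, attrs):
        # The parser lower-cases tag names; <br/> and <br /> also arrive here.
        if tag in ("p", "br"):
            self._break()

    def handle_endtag(self, tag):
        if tag == "p":
            self._break()

    def handle_data(self, data):
        self.paragraphs[-1].append(data)


def html_to_wrapped_text(html, width=72):
    """Return the text of `html` as hard-wrapped, blank-line-separated paragraphs."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace inside each paragraph before wrapping it.
    texts = (" ".join("".join(pieces).split()) for pieces in parser.paragraphs)
    return "\n\n".join(textwrap.fill(t, width=width) for t in texts if t)


if __name__ == "__main__":
    print(html_to_wrapped_text("<p>One long paragraph.</p>Line A<br/>Line B"))
```

Whether a "<br>" should really end a paragraph depends on how the source files
were produced, so that choice may need adjusting for the books in question.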



> On writing Pandoc filters with Lua, see <https://pandoc.org/lua-filters.html>.
> 
> Best regards,
> 
> Paul
> 



-- 

Shlomi Fish       https://www.shlomifish.org/
https://is.gd/MQHVF3 - The Atom Text Editor edits a 2,000,001B file

Joel’s Generalisation: If it happens to you, it happens to everybody.
(Or: It’s never only you.)
    — Based on http://www.joelonsoftware.com/news/20020402.html

Please reply to list if it's a mailing list post - http://shlom.in/reply .




