Convert unwrapped paragraphs to hard wrapped paragraphs when there's no blank lines.
Linux for blind general discussion
blinux-list at redhat.com
Sat Mar 28 07:39:33 UTC 2020
Hi Paul,
On Fri, 27 Mar 2020 14:43:01 -0700
Linux for blind general discussion <blinux-list at redhat.com> wrote:
> > I don't understand how paragraphs start and end in these files. Otherwise
> > you can try using one of the text processing tools mentioned here:
> >
> > * https://www.shlomifish.org/open-source/resources/text-processing-tools/
> >
> > * https://www.computerhope.com/unix/ufold.htm
> >
> > * https://en.wikipedia.org/wiki/Fmt_(Unix)
> >
> > * https://en.wikipedia.org/wiki/Par_(command)
> >
> > Note that you may have better luck converting EPUBs (assuming they lack
> > https://en.wikipedia.org/wiki/Digital_rights_management ) to plaintext using
> > tools such as https://pandoc.org/ ,
> > https://metacpan.org/search?q=html%3A%3Awikiconverter&size=20 , etc.
>
> Of that list of programs, I'd be inclined to use Pandoc. It permits
> you to write filters in (embedded) Lua, which is a quick-to-learn
> programming language. For example, this Lua one-liner converts a
> string ("s") to add a line break after each existing line break:
>
> s = string.gsub(s, "<BR>", "<BR>\n<BR>")
>
Other tools may work as well. Furthermore, your substitution will not work if
the HTML uses "<br>", "<br />" or "<br/>" for line breaks (the pattern as
written matches only the uppercase "<BR>"), or if it uses the more recommended
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/p element.
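
To illustrate the point about "<br>" variants, here is a minimal sketch of a
more tolerant substitution. It is written in Python rather than Lua, it is not
a Pandoc filter, and the names BR_TAG and double_line_breaks are made up for
this example. It only assumes that line breaks are spelled as some form of
"<br>" tag, in any letter case, with or without a trailing slash.

```python
import re

# Matches <br>, <BR>, <br/>, <br /> and similar spellings (attributes allowed).
# The \b keeps it from matching other tags that merely start with "br".
BR_TAG = re.compile(r"<br\b[^>]*>", re.IGNORECASE)


def double_line_breaks(html: str) -> str:
    """Follow every <br> variant with a newline and a second copy of the tag."""
    return BR_TAG.sub(lambda m: m.group(0) + "\n" + m.group(0), html)


if __name__ == "__main__":
    print(double_line_breaks("first<br />second<BR>third"))
```

This still does nothing for text marked up with "<p>" elements, and a regular
expression can be fooled by comments or unusual markup, so a real HTML parser
is the sturdier choice for anything beyond simple input.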

Also see:

* https://perl-begin.org/uses/text-parsing/

* https://blog.codinghorror.com/parsing-html-the-cthulhu-way/

> On writing Pandoc filters with Lua, see <https://pandoc.org/lua-filters.html>.
>
> Best regards,
>
> Paul
>
--
Shlomi Fish https://www.shlomifish.org/
https://is.gd/MQHVF3 - The Atom Text Editor edits a 2,000,001B file
Joel’s Generalisation: If it happens to you, it happens to everybody.
(Or: It’s never only you.)
— Based on http://www.joelonsoftware.com/news/20020402.html
Please reply to list if it's a mailing list post - http://shlom.in/reply .