Hello,
For my activist publishing, I am doing a lot of post-#OCR formatting manually. One of the most tedious tasks is to join several lines back into one paragraph, so they can be formatted (or machine-translated) properly.
Right now, I have 2022 lines of text to be processed like that, using my favourite #Gedit.

What could help? Possibly a #plugin doing a few simple actions:
-- grab the manually selected text,
-- replace all non-hyphenated ends of line with a space,
-- remove all hanging hyphens and their EOLs.

If anyone knows such a plugin, or perhaps another text editor with such functionality, or maybe would like to volunteer to write such a piece of code, please let me know.

@petros I know how to do this from the CLI or in vim, or even in gedit, but without restricting it to the selected text. I'm not sure whether any of these would be helpful for you.

In particular, I could write a script for you that does this for the whole file, and then you would only have to manually add hyphens before the line breaks you want to keep before running the script. An annoying workaround, but I'm not sure what would be helpful, so I'm mentioning it.

@petros Oh, and I could also help with something like "Select text→Ctrl+C→switch to terminal→run script→switch to gedit→Ctrl+V", but that might also not be helpful.
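
For reference, here is a minimal sketch of what such a script could look like in Python. It is only an illustration of the steps listed in the request, not a finished tool. The function name `join_wrapped_lines` is made up for this example, and treating blank lines as paragraph separators is my own assumption (the request does not say how paragraphs should be kept apart).

```python
import re


def join_wrapped_lines(text: str) -> str:
    """Join hard-wrapped lines back into paragraphs.

    Assumption (not from the original request): one or more blank
    lines separate paragraphs, and those breaks are preserved.
    """
    paragraphs = []
    # Split the input wherever a blank (or whitespace-only) line occurs.
    for block in re.split(r"\n[ \t]*\n", text.replace("\r\n", "\n")):
        joined = ""
        for line in block.splitlines():
            line = line.strip()
            if not line:
                continue
            if joined.endswith("-"):
                # Hanging hyphen: drop the hyphen and the line break.
                joined = joined[:-1] + line
            elif joined:
                # Ordinary line end: replace the line break with a space.
                joined += " " + line
            else:
                joined = line
        if joined:
            paragraphs.append(joined)
    # Paragraphs are written back separated by a single blank line.
    return "\n\n".join(paragraphs)
```

Note that this removes every hyphen at the end of a line, so a genuine compound such as "well-" / "known" split across lines would come out as "wellknown" and need a manual fix. To use it in the copy-and-paste workflow above, the function could be wrapped so that it reads the text from standard input and prints the result. As far as I know, gedit's External Tools plugin can also pass the current selection to a command and replace it with the output, which might avoid the terminal round trip, but I have not checked this against the current gedit version.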
