**Christopher Pollin** @chpollin@fedihum.org · Feb 03, 2026, 09:39 *

I'm currently working on an OCR/HTR "editor-in-the-loop" browser tool.

It offers rule-based and LLM-based validation recommendations. You can load PAGE XML, IIIF, and images into it, transcribe them with Gemini 3 Flash (or whatever you want to use, such as your local DeepSeek OCR 2 via Ollama), and then export the result in different formats. HTR will be trickier, but for OCR, DeepSeek OCR 2 is very good.

[Attached image: e34ff59fe3e20d27.png]

**Ingrid Mason** @ingridbmason@ausglam.space · Feb 03, 2026, 11:07

@chpollin interested to know what tool you’re using. Is it possible to share a link to information about it?

**Ingrid Mason** @ingridbmason@ausglam.space · Feb 03, 2026, 11:16

@chpollin found it, for other curious eyes. 👀 https://github.com/DigitalHumanitiesCraft/co-ocr-htr

**Christopher Pollin** @chpollin@fedihum.org · Feb 03, 2026, 12:04 *

@ingridbmason Yes, that is the repository. It is a work in progress and is built using Claude Code. I am using a specific way of representing context information in an Obsidian-like structure (= knowledge folder). :)

**Martin Ruskov** @mapto@qoto.org · Feb 03, 2026, 12:27

@chpollin @ingridbmason "integrating domain experts". I'd suggest reconsidering this formulation to better reflect what roles you would like to give to people and agents. It might not sound like a big thing, but human-in-the-loop is kinda the opposite of computer-assisted.

**Christopher Pollin** @chpollin@fedihum.org · Feb 03, 2026, 12:31

@mapto @ingridbmason Thank you very much! Yes, you're absolutely right, and this is also what I was thinking about: putting the editor/expert in the center. It's just me working on this alone from home, so I very much appreciate such feedback. I will adapt this! :)

**Christopher Pollin** @chpollin@fedihum.org · Feb 03, 2026, 12:36

@mapto @ingridbmason Edit: Maybe "computer-assisted" is too broad as well? Do you have any resources on framing this better? I find naming and framing these human/AI relationships an important topic. :)

**Martin Ruskov** @mapto@qoto.org · Feb 05, 2026, 03:20

@chpollin @ingridbmason I have found the persuasive technology triad to be quite relevant: https://en.wikipedia.org/wiki/Persuasive_technology#Functional_triad. With some wishful thinking, this could be translated to computer-aided/-assisted (as in CAD, CAM, CALL, etc.), computer-supported (as in CSCW/CSCL), and computer-generated (as in GenAI).

**Christopher Pollin** @chpollin@fedihum.org · Feb 05, 2026, 04:53 *

@mapto @ingridbmason Thank you very much! :)

I like it!
