On July 1st, I'll be presenting our work on the BogoSlov project (4euplus.eu/4EU-1150.html) at the conference "Transfer of Ideas in European Intellectual History: From Medieval Manuscripts to Interactive Online Content" held at the Institute of Slavistics of the University of Innsbruck @uniinnsbruck.

I will be presenting the four distinct approaches we take to attempt to identify in texts. The approaches range from established formal algorithms such as regular expressions and longest common subsequence, pass through conventional computational linguistics with lemma n-grams, and conclude with an approach based on sentence transformers. I will also talk about the long list of challenges the field is facing, including the need to support standards like TEI and to take full advantage of recent versions of Unicode, even though ambiguities sometimes remain even with it.

Looking forward to seeing many old and new friends there!
