@tdietterich yes, LLMs certainly help in citing references that have nothing to do with the topic of the paper... if these references exist at all.
Or did you mean that the most recent cause of the problem could be part of its solution?
@tdietterich I am looking forward to hearing someone respond positively to your call, because I'm rather sceptical about the reliability of such assessments made by language models.
Of course, it depends on what kind of "matches" you're after. At this stage, I tend to think different approaches are necessary for explicit vs. implicit references. For the former, LLMs appear less appropriate than smaller bespoke models; for the latter, LLMs seem ineffective across the board: the sophistication of the associated thought and language is well beyond what GenAI can handle.
The starting point for these ideas comes from a couple of works (mine and others'), early versions of which were presented at this venue: https://aclanthology.org/volumes/2023.nlp4dh-1/ . Extended versions are due to appear here: https://jdmdh.episciences.org/volume/view/id/593 . It's all work in progress, though.
@mapto Thanks! I'm particularly interested in flagging suspicious submissions to arXiv. Some false-positive flags are OK -- human moderators will review them all. But I would like to minimize false negatives.