It's too bad the DPR dataset is TREC, which means it's still based on supervised learning :( I was hoping for something like the beam search in Google's Meena (the awesome open-domain chatbot).

arxiv.org/pdf/2004.04906.pdf

And it uses FAISS, the infamous mmap indexer.

Approximate vector search (the nearest-neighbor kind, like annoy/hnsw/milvus) is actually shitty, especially on <10k docs: it'll return random shit, way out of the query context, regardless of which word/sentence embedding is used.
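
For a corpus that small, one option is to skip the approximate index and just score every document exactly. Here is a minimal sketch of brute-force cosine search in plain Python. It is not FAISS, annoy, or hnsw code; the names `cosine`, `exact_top_k`, `docs`, and `query` are made up for illustration, and the vectors are toy values standing in for real embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def exact_top_k(query, docs, k=3):
    """Score every document against the query and return the k best as (index, score)."""
    scored = [(cosine(query, vec), idx) for idx, vec in enumerate(docs)]
    # Sort by similarity, highest first; no approximation involved.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(idx, score) for score, idx in scored[:k]]


# Toy 3-dimensional "embeddings" (hypothetical values, not from a real model).
docs = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
query = [1.0, 0.05, 0.0]

for idx, score in exact_top_k(query, docs, k=2):
    print(idx, round(score, 4))
```

This is linear in the number of docs per query, so it only makes sense at the small scale mentioned above. It won't fix bad embeddings, but it does take the index approximation out of the picture when you're debugging irrelevant results.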
