it's too bad the DPR dataset is TREC, which means it's still supervised-learning based :( hoping for something like the beam search in google's meena (the awesome open-domain chatbot)
approximate vector search (the nearest-neighbor kind, like annoy/hnsw/milvus), especially on <10k docs, is actually shitty: it'll return random shit, way outside the query context, no matter which word / sentence embedding you use.
and it uses FAISS, the infamous mmap indexer.
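fwiw, at <10k docs you probably don't need an approximate index at all. A rough sketch of exact (brute-force) cosine search with plain numpy is below. Assumptions: the embeddings are already computed and non-zero, the random vectors only stand in for real ones, and `exact_top_k` is a made-up helper name, not part of any library.

```python
import numpy as np

def exact_top_k(query_vec, doc_vecs, k=5):
    """Brute-force cosine similarity: score every doc, keep the k best.

    Hypothetical helper for illustration; assumes no zero-length vectors.
    """
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q                      # one cosine score per document
    k = min(k, len(scores))
    top = np.argsort(-scores)[:k]       # indices sorted by descending score
    return top, scores[top]

# Stand-in data: 10k random "embeddings" of dimension 384 (an assumption).
rng = np.random.default_rng(0)
docs = rng.normal(size=(10_000, 384)).astype(np.float32)

# A query that is a slightly noisy copy of document 42.
query = docs[42] + 0.01 * rng.normal(size=384).astype(np.float32)

idx, sims = exact_top_k(query, docs, k=3)
print(idx, sims)
```

since every doc gets scored, there is no recall loss from the index itself; if the results are still off, the embedding is the thing to look at.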