oh i forgot: from now on i won't save any "quasi" database to a csv, it'll be plain json or just sqlite
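a minimal sketch of what that switch could look like, using only the python standard library. the `todos` table, its columns, and the helper names are made up for illustration, not taken from any real project:

```python
import json
import sqlite3


def save_json(records, path):
    """Write a list of dicts to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)


def load_json(path):
    """Read the list of dicts back from a JSON file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def save_sqlite(records, path):
    """Append records to a hypothetical `todos` table in a SQLite file."""
    conn = sqlite3.connect(path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS todos ("
                "id INTEGER PRIMARY KEY, "
                "title TEXT NOT NULL, "
                "done INTEGER NOT NULL DEFAULT 0)"
            )
            conn.executemany(
                "INSERT INTO todos (title, done) VALUES (?, ?)",
                [(r["title"], int(r["done"])) for r in records],
            )
    finally:
        conn.close()


def load_sqlite(path):
    """Read all rows back as dicts, in insertion order."""
    conn = sqlite3.connect(path)
    try:
        rows = conn.execute("SELECT title, done FROM todos ORDER BY id").fetchall()
    finally:
        conn.close()
    return [{"title": title, "done": bool(done)} for title, done in rows]
```

unlike csv, both formats keep types (booleans, numbers) and don't care about commas or newlines inside a field.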
did live coding on twitch last weekend. so many things to learn. interaction (twitch-driven development) is really hard because i'm really bad at context switching. but technical-wise, i learnt a lot: how to make a personal todo manager on stream, project management via markdown files 😂 and a few other things. i don't really recommend streaming while learning stuff, but for menial tasks, having that impartial observer is actually pretty nice. it makes me much more motivated to do boring stuff.
wow, what a pitch
wow, a "roam research"-like thing in vscode. a zettelkasten-like knowledge manager. pretty nifty stuff, using it for my stream and learning notes from now on!
and it uses FAISS, the infamous nearest-neighbor indexer.
approximate vector search (the nearest-neighbor kind, like annoy / hnsw / milvus), especially on <10k docs, is actually shitty (it returns random shit, way outside the query context), regardless of which word / sentence embedding is used.
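at that size an exact scan is cheap, so here's a rough sketch of a brute-force cosine search to compare against. the `docs` mapping and the function names are hypothetical, and this only rules out the approximation as the culprit; it won't rescue bad embeddings:

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def exact_top_k(query, docs, k=3):
    """Score every document against the query and return the k best.

    `docs` maps a document id to its embedding vector (assumed to come
    from whatever embedding model is in use).
    """
    scored = [(cosine(query, vec), doc_id) for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

if the exact results are also off-topic, the problem is the embeddings or the query, not the index.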
it's too bad the DPR datasets are TREC ones, which means it's still supervised learning :( hoping for something like the beam search in google's meena (the awesome open-domain chatbot)
starting to see the pattern that i don't like in #100daysoffload. so it's a glorified toot / tweet, ain't it?
Rant (translated from Indonesian)
this week i went back to basics again, studying a lot of data structures, especially trees and graphs, because for search results and autocomplete it's really still too early to try word2vec / doc2vec, and i want deterministic results. while watching leetcode solutions on youtube i got torn about buying a wacom tablet. but then i thought, typing is faster than writing anyway, and i can't draw either lol
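for the autocomplete part, a trie is the classic tree that gives deterministic results. a small sketch, assuming plain lowercase words and alphabetical ordering of suggestions (class and method names are just for illustration):

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # maps a character to the next TrieNode
        self.is_word = False  # True if a word ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        """Add a word, creating nodes along its path as needed."""
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def complete(self, prefix, limit=10):
        """Return up to `limit` stored words starting with `prefix`, sorted."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []  # no stored word has this prefix
        results = []
        self._collect(node, prefix, results, limit)
        return results

    def _collect(self, node, word, results, limit):
        """Depth-first walk in sorted child order, so output is deterministic."""
        if len(results) >= limit:
            return
        if node.is_word:
            results.append(word)
        for ch in sorted(node.children):
            self._collect(node.children[ch], word + ch, results, limit)
```

usage would be something like inserting every title or query into the trie once, then calling `complete("tr")` on each keystroke. the same prefix always gives the same list, in the same order.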
well sometimes poison is good. https://github.com/neulab/RIPPLe
lifelong pattern hunter