benching 10 fast, non-cryptographically secure hashing algos on a small text file
@schlink By the way, I dunno if this is helpful, but a while back I developed an application where I wanted to know if I had duplicate files anywhere on a disk.
Rather than hash the entire disk just to build an index, I used some deterministic algorithm to choose a small pseudorandom subset of each file and hash that to build my index. There were collisions among non-duplicate files, but they were few and far between (and often not even among files with the same size), and it was a simple matter to do a full hash on the small subset of files with collisions.
QOTO: Question Others to Teach Ourselves. A STEM-oriented instance.
An inclusive free speech instance.
All cultures and opinions welcome.
Explicit hate speech and harassment strictly forbidden.
We federate with all servers: we don't block any servers.