We developed a framework based on @zarr_dev to benchmark lossless and lossy compression of #Neuropixels and similar data. The benchmark datasets included NP1 and NP2 recordings, available on the Registry of Open Data on @AWS:
https://registry.opendata.aws/allen-nd-ephys-compression/
(2/n)
We then investigated two #LossyCompression strategies: bit truncation and WavPack Hybrid mode. Lossy compression can dramatically boost compression performance, but we must first assess how it affects downstream analysis (i.e., spike sorting).
(5/n)
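To make the bit-truncation idea concrete, here is a minimal sketch, not the code from the benchmark framework. It assumes 16-bit integer samples, uses synthetic Gaussian noise in place of a real recording, and swaps in `zlib` from the Python standard library for the compressors used in the study. The helpers `truncate_bits` and `compressed_size_percent` are hypothetical names chosen for illustration.

```python
import random
import zlib
from array import array


def truncate_bits(samples, n_bits):
    """Zero the n_bits least-significant bits of each int16 sample (lossy)."""
    mask = ~((1 << n_bits) - 1)  # e.g. n_bits=2 -> ...11111100
    return array("h", (s & mask for s in samples))


def compressed_size_percent(samples):
    """Compressed size as a percentage of the raw size (zlib as a stand-in)."""
    raw = samples.tobytes()
    return 100.0 * len(zlib.compress(raw, 9)) / len(raw)


# Synthetic stand-in for one channel of an int16 recording (assumption).
rng = random.Random(0)
signal = array("h", (int(rng.gauss(0, 50)) for _ in range(30000)))

for n_bits in (0, 2, 4):
    lossy = truncate_bits(signal, n_bits)
    # The per-sample error introduced by truncation is below 2**n_bits.
    max_err = max(abs(a - b) for a, b in zip(signal, lossy))
    size = compressed_size_percent(lossy)
    print(f"truncated bits={n_bits}  size={size:.1f}%  max error={max_err}")
```

Dropping low-order bits removes the least predictable part of each sample, so the lossless stage that follows has less entropy to encode. The cost is a bounded per-sample error, which is why the effect on spike sorting has to be checked separately.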
Using simulated data with known ground truth spike times, we used #Kilosort 2.5 to evaluate spike sorting performance. WavPack Hybrid does not affect spike sorting accuracy, even at maximum compression levels (~14% file size).
(6/n)
At @AllenInstitute for Neural Dynamics we value fairness and reproducibility in science. All figures of the manuscript can be reproduced with @codeocean:
https://codeocean.com/capsule/3822095/tree/v1
(11/n)
Finally, kudos to all co-authors: Olivier Winter, David Bryant, David Feng, @svoboda314 and Josh Siegle. Thanks to @alleninstitute for sponsoring this work!
(12/n)
@buccino_alessio 👍 I always wondered if ephys and (human) audio signals are similar for some deep reason or just coincidence. I guess they are both roughly 1/f with about 20 kHz bandwidth
We started with #LosslessCompression. Across a range of general-purpose (GP) compressors, we found that #Zstandard with @Blosc2 achieves the best compromise between compression ratio and decompression speed!
NP1: compressed size ~36%
NP2: compressed size ~52%
(3/n)
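For readers who want to see what such a comparison measures, here is a rough sketch of a lossless benchmark loop. It is not the framework from the thread: it uses only the general-purpose compressors in the Python standard library (`zlib`, `bz2`, `lzma`) as stand-ins, since Zstandard and Blosc need third-party packages, and it runs on synthetic int16 noise instead of NP1/NP2 recordings. The `results` layout is an assumption made for illustration.

```python
import bz2
import lzma
import random
import time
import zlib
from array import array

# Synthetic stand-in for raw int16 samples (assumption, not real data).
rng = random.Random(0)
raw = array("h", (int(rng.gauss(0, 50)) for _ in range(30000))).tobytes()

# General-purpose compressors available in the standard library.
codecs = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}

results = {}
for name, (compress, decompress) in codecs.items():
    packed = compress(raw)

    # Time decompression only, since that is the speed the trade-off refers to.
    start = time.perf_counter()
    restored = decompress(packed)
    elapsed = time.perf_counter() - start

    # Lossless means the round trip is bit-exact.
    assert restored == raw

    size_percent = 100.0 * len(packed) / len(raw)
    results[name] = (size_percent, elapsed)
    print(f"{name}: compressed size {size_percent:.1f}%, "
          f"decompression {elapsed * 1e3:.2f} ms")
```

"Compressed size" here is the compressed byte count as a percentage of the raw byte count, the same convention as the NP1 and NP2 figures above. Timings from a single run on a small buffer are noisy, so a real benchmark would repeat the measurement on much larger chunks.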