in Brainfuck, loops that look like [>>>] are common. these are memory scans: this one sweeps memory in the positive direction, 3 cells at a time, stopping at the first zero-valued cell whose offset from the starting position is a multiple of 3.

if I want this to go fast on Neon, I can load 16 bytes into a vector register, test them all for equality with zero, and then what? I guess add up the resulting vector and check if the sum is zero, and if it isn't, scan the elements one by one? is there a faster way to do this?

I'm a super noob at vector programming, like I just don't really do it
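
a rough sketch of the idea above in C, not a tuned implementation. it loads 16 bytes, compares them with zero, masks off the lanes that [>>>] would never visit, and then finds the first hit without a lane-by-lane scan by narrowing the compare result to 4 bits per lane and counting trailing zeros. `scan_stride3` and the `keep` table are hypothetical names made up for this example; it assumes little-endian AArch64 and a GCC/Clang-style `__builtin_ctzll`, and falls back to a plain scalar loop everywhere else.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper modelling the Brainfuck loop [>>>].
 * Returns the index of the first zero cell among pos, pos+3, pos+6, ...
 * or len if there is none before the end of the tape.
 * Assumes pos <= len. */
static size_t scan_stride3(const uint8_t *mem, size_t pos, size_t len)
{
    size_t i = pos;

#if defined(__aarch64__) && defined(__ARM_NEON)
    /* keep[p][j] is 0xFF when lane j is on the stride, for a 16-byte
     * chunk whose first on-stride lane is p. 16 is not a multiple of 3,
     * so the phase p changes from one chunk to the next. */
    static const uint8_t keep[3][16] = {
        {0xFF, 0, 0, 0xFF, 0, 0, 0xFF, 0, 0, 0xFF, 0, 0, 0xFF, 0, 0, 0xFF},
        {0, 0xFF, 0, 0, 0xFF, 0, 0, 0xFF, 0, 0, 0xFF, 0, 0, 0xFF, 0, 0},
        {0, 0, 0xFF, 0, 0, 0xFF, 0, 0, 0xFF, 0, 0, 0xFF, 0, 0, 0xFF, 0},
    };

    while (i + 16 <= len) {
        size_t p = (3 - (i - pos) % 3) % 3;          /* first on-stride lane */
        uint8x16_t cells = vld1q_u8(mem + i);
        uint8x16_t zero  = vceqq_u8(cells, vdupq_n_u8(0)); /* 0xFF where cell == 0 */
        uint8x16_t hits  = vandq_u8(zero, vld1q_u8(keep[p]));

        /* Narrowing shift: squeeze each 0x00/0xFF lane down to 4 bits,
         * giving a 64-bit mask where nibble j describes lane j. */
        uint8x8_t nibbles = vshrn_n_u16(vreinterpretq_u16_u8(hits), 4);
        uint64_t bits = vget_lane_u64(vreinterpret_u64_u8(nibbles), 0);

        if (bits != 0)
            return i + (size_t)(__builtin_ctzll(bits) >> 2);
        i += 16;
    }
    /* Realign i to the next index that is on the stride. */
    i += (3 - (i - pos) % 3) % 3;
#endif

    /* Scalar tail (and the whole scan on non-AArch64 targets). */
    for (; i < len; i += 3)
        if (mem[i] == 0)
            return i;
    return len;
}
```

if all that's needed is "was there any hit in this chunk?", a horizontal reduction such as `vmaxvq_u8` or `vaddvq_u8` on the masked compare result answers that too. the narrowing shift is used here because the same 64-bit value also gives the position of the first hit.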


@adrian @regehr

Doesn't it do that in every vector element independently?
