I'm not sure if that's anyhow fast, but you could perform the comparison (ending with an all-ones byte where it succeeded and all-zeroes elsewhere), AND it with a vector that has (2^i)-i in i-th position, OR it up, add 1, and use any of the scalar ways of figuring out which power of two that is.
(Alternatively, have the second vector have 2^i in i-th position, AND it with first, add them up and use a scalar way to figure out which is the highest bit set.
@robryk but excited to get my class working on this sort of thing, we'll see who creates fast code. I don't want to present them with a finished solution, that's boring!