I've been having fun training a transformer to generate fly behavior. There's a lot more work to do, but I kind of love watching my little synthetic fly! In this video, I'm plotting the positions of 19 keypoints on each fly -- its pose. Thin-lined flies are real tracked flies, and the thick-lined fly is synthetic. This model is trained to predict one frame (1/150th of a second) into the future, but is run in open loop for 1000 frames (6.7 seconds).
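For anyone curious what "run in open loop" means mechanically, here's a rough sketch: the model only ever predicts one frame ahead, and each prediction gets fed back in as if it were real data. Everything named here (`open_loop_rollout`, `predict_next`, `constant_velocity`) is made up for illustration, and the toy predictor is just a stand-in for the transformer, not the actual model.

```python
from collections import deque
from typing import Callable, List, Sequence

Pose = List[float]  # flattened keypoint coordinates for one frame (assumed layout)

FPS = 150  # one frame = 1/150th of a second, as in the post


def open_loop_rollout(
    predict_next: Callable[[Sequence[Pose]], Pose],
    seed_frames: Sequence[Pose],
    n_steps: int,
    context_len: int,
) -> List[Pose]:
    """Generate n_steps frames, feeding each prediction back in as input.

    predict_next is a hypothetical one-step predictor: it takes the most
    recent frames (oldest first) and returns the next frame.
    """
    if not seed_frames:
        raise ValueError("need at least one real frame to start from")
    # Sliding window: only the last context_len frames are kept.
    context = deque(seed_frames, maxlen=context_len)
    generated: List[Pose] = []
    for _ in range(n_steps):
        nxt = predict_next(list(context))
        generated.append(nxt)
        context.append(nxt)  # the prediction becomes part of the input
    return generated


def constant_velocity(context: Sequence[Pose]) -> Pose:
    """Toy stand-in for the model: extrapolate the last frame-to-frame step."""
    if len(context) < 2:
        return list(context[-1])
    prev, last = context[-2], context[-1]
    return [2 * b - a for a, b in zip(prev, last)]


seed = [[0.0, 0.0], [1.0, 2.0]]
rollout = open_loop_rollout(constant_velocity, seed, n_steps=3, context_len=64)
seconds_generated = 1000 / FPS  # a 1000-frame rollout at 150 frames per second
```

After the seed frames, the model never sees real data again, so any errors it makes can compound over the rollout.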
@twitskeptic I only look at videos of insects, not insects in real life. That makes them a lot cuter :). The model is quite small right now (relatively speaking), since I'm just debugging and currently using the first set of hyperparameters I tried. It's 6 transformer layers, 8 heads, hidden dim = 512. I gave it relatively little context (64 frames), and am now seeing what happens when I give it 512 frames of context.
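To put those numbers in perspective, here's a small sketch that writes the hyperparameters down and converts the context length to seconds at 150 frames per second. The class and field names (`FlyModelConfig`, `head_dim`, and so on) are hypothetical, and `head_dim` assumes the usual convention of splitting the hidden dimension evenly across heads.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FlyModelConfig:
    """Hypothetical container for the hyperparameters mentioned above."""

    n_layers: int = 6
    n_heads: int = 8
    hidden_dim: int = 512
    context_frames: int = 64
    fps: int = 150  # one frame = 1/150th of a second

    @property
    def head_dim(self) -> int:
        # Assumes the hidden dimension is split evenly across heads.
        if self.hidden_dim % self.n_heads != 0:
            raise ValueError("hidden_dim must be divisible by n_heads")
        return self.hidden_dim // self.n_heads

    @property
    def context_seconds(self) -> float:
        # How much behavior the model can see at once, in seconds.
        return self.context_frames / self.fps


small_context = FlyModelConfig()
long_context = replace(small_context, context_frames=512)

print(f"head_dim = {small_context.head_dim}")
print(f"64 frames  = {small_context.context_seconds:.2f} s of context")
print(f"512 frames = {long_context.context_seconds:.2f} s of context")
```

So 64 frames is a bit under half a second of behavior, while 512 frames is roughly 3.4 seconds.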