Idea for Stable Diffusion: train a model to correct an image that has been randomly deformed. Perlin noise, or similar, may be cheap enough for generating the random deformations; otherwise something like GIMP's Pick noise filter, which just randomly exchanges pixels with nearby pixels n times, may be faster.
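A rough sketch of the cheaper option, to make the training data concrete. This is not GIMP's code: it assumes NumPy, and the names `pick_noise` and `make_training_pair` are made up for illustration. It also simplifies "exchange" to "copy": each pixel takes the value of a random pixel in its 3x3 neighbourhood (possibly itself), so some values get duplicated and others lost.

```python
import numpy as np

def pick_noise(image, repeats=1, rng=None):
    """Pick-style deformation (assumed simplification, not GIMP's implementation).

    On each pass, every pixel is replaced by a randomly chosen pixel from its
    3x3 neighbourhood. More passes let content drift further from its origin.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = np.asarray(image).copy()
    h, w = out.shape[:2]
    rows, cols = np.mgrid[0:h, 0:w]
    for _ in range(repeats):
        # Random offsets in {-1, 0, 1} for every pixel, per axis.
        dy = rng.integers(-1, 2, size=(h, w))
        dx = rng.integers(-1, 2, size=(h, w))
        # Clamp at the borders so source indices stay inside the image.
        src_r = np.clip(rows + dy, 0, h - 1)
        src_c = np.clip(cols + dx, 0, w - 1)
        out = out[src_r, src_c]
    return out

def make_training_pair(image, repeats, rng=None):
    """Return (deformed input, clean target) for a hypothetical correction model."""
    return pick_noise(image, repeats, rng), np.asarray(image)
```

Varying `repeats` per sample would play roughly the role of the noise level: a few passes gives a slightly jittered image, many passes gives something much more scrambled.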
Theoretically, you could use a regular image as the initial noisy image, and the model would then deform it to match what it thinks is the denoised equivalent. This might allow for, e.g., correcting anatomical problems in characters, composition problems, etc.
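What that inference step might look like, very roughly. Everything here is an assumption: `model` stands in for a trained correction network (any callable mapping an image array to its guess at the undeformed image), and the `steps` and `strength` parameters are invented for the sketch, not taken from any existing pipeline.

```python
import numpy as np

def refine(image, model, steps=4, strength=0.5):
    """Iteratively nudge a regular image toward the model's 'corrected' guess.

    `model` is a hypothetical callable: array in, array of the same shape out.
    `strength` blends each prediction with the current image so that a single
    step cannot replace the input wholesale.
    """
    x = np.asarray(image, dtype=np.float32)
    for _ in range(steps):
        predicted = model(x)
        # Move part of the way from the current image to the prediction.
        x = x + strength * (predicted - x)
    return x
```

A low `strength` with few steps should keep the result close to the original image, which is probably what you want when only fixing anatomy or composition.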