For machine learning analysis of sound, I've found nothing that beats a spectrogram.
While making this post, I wondered, can you make audio from a spectrogram? The answer is, apparently you can. https://stackoverflow.com/questions/57967487/convert-spectrogram-to-audio-using-librosa-functions
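@lhackworth Of course you can make audio from a spectrogram: just do an inverse FFT on it and you're back in the time domain with your audio signal :) (That works as long as you kept the phase information. A magnitude-only spectrogram needs the phase estimated first.)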
Now what's really cool is that you can **manipulate** a spectrogram and meaningfully manipulate the audio...
1) Convert the audio to a spectrogram (essentially, an FFT is used to turn it into a frequency-domain representation).
2) Manipulate it as you wish. For example, if you want to remove a high-pitched squeal that happens to ruin the audio, just zero out or otherwise visually remove the squeal from the spectrogram. In its simplest form this would look like just blacking out/zeroing the horizontal line where it shows up.
3) Convert the modified spectrogram back into audio (do an inverse FFT).
You will now have audio that should be roughly the same as what you started with, but without the squeal.
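Here is a rough sketch of those three steps using only NumPy, in case it helps to see them spelled out. Everything in it is an assumption made for illustration: the helper names `stft` and `istft`, the synthetic 440 Hz tone with a 6000 Hz "squeal", the 16 kHz sample rate, the frame size, and the 5800-6200 Hz band that gets zeroed. It keeps the complex (magnitude plus phase) spectrogram, which is why a plain inverse FFT is enough to get back to audio. If all you have is a magnitude image, you would need a phase-estimation method such as Griffin-Lim instead.

```python
import numpy as np

def stft(signal, frame_len=1024, hop=512):
    """Step 1: slice into overlapping windowed frames and FFT each one."""
    # Periodic Hann window; at 50% overlap the shifted copies sum to 1.
    window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(frame_len) / frame_len)
    padded = np.pad(signal, frame_len)  # zero-pad so the edges are fully covered
    n_frames = 1 + (len(padded) - frame_len) // hop
    frames = np.stack([padded[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1), window

def istft(spec, window, hop, length):
    """Step 3: inverse FFT each frame and overlap-add back into one signal."""
    frame_len = len(window)
    frames = np.fft.irfft(spec, n=frame_len, axis=1)
    out = np.zeros(hop * (len(frames) - 1) + frame_len)
    norm = np.zeros_like(out)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + frame_len] += frame
        norm[i * hop:i * hop + frame_len] += window
    out /= np.maximum(norm, 1e-8)  # undo the window weighting
    return out[frame_len:frame_len + length]  # drop the padding

# Synthetic test audio (assumed for the demo): a tone plus a squeal.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
noisy = tone + 0.5 * np.sin(2 * np.pi * 6000 * t)

frame_len, hop = 1024, 512
spec, window = stft(noisy, frame_len, hop)

# Step 2: "black out" the horizontal band where the squeal shows up.
freqs = np.fft.rfftfreq(frame_len, d=1 / sr)
squeal_band = (freqs >= 5800) & (freqs <= 6200)
spec[:, squeal_band] = 0

cleaned = istft(spec, window, hop, len(noisy))
```

Zeroing a whole band like this also removes anything else that lived at those frequencies, so in real audio you would want the band as narrow as you can get away with, or only zero the frames where the squeal actually occurs.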