So far, the architecture is 728x128x32x10, learning rate of 0.001 learning rate, sigmoid activation for all neurons, and using stochastic gradient descent. It's definitely lowering the error over time, but it's at such a slow rate that it'd take ~2 months to actually start guessing inputs correctly (which is definitely not ideal).
Tricky bit is getting the architecture and settings done in such a way that it can train against the MNIST dataset within a sane amount of time on my VPS.
The world has a near endless stream of options of music to listen to, a list that consists of masterpieces constructed by people who spent years working on their craft... Yet here I am listening to 10 hours of a cartoon cat singing EEEAAAOOO
https://youtu.be/v1K4EAXe2oo
On the plus side, I can work remotely until the repairs are done. Always good to be greatful for the little things.
Related gripe, it got cold last week and snowed over the weekend which resulted in my car now needing lots of repairs.
HPC systems engineer