Decided to finally implement a deep neural network engine from scratch.

The tricky bit is picking an architecture and settings that let it train against the MNIST dataset within a sane amount of time on my VPS.

So far, the architecture is 784x128x32x10 (MNIST images are 28x28 = 784 pixels) with a learning rate of 0.001, sigmoid activation for all neurons, and stochastic gradient descent. It's definitely lowering the error over time, but so slowly that it'd take ~2 months to actually start guessing inputs correctly (which is definitely not ideal).
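
For reference, here is a minimal sketch of that kind of setup in plain Python: fully connected layers, sigmoid everywhere, and one weight update per sample. It is not the actual engine. The class name `Network`, the squared-error loss, and the uniform weight initialisation are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    # Numerically stable form: never calls exp() on a large positive number.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


class Network:
    """Hypothetical fully connected sigmoid network trained one sample at a time."""

    def __init__(self, sizes, seed=0):
        rng = random.Random(seed)
        self.weights = []  # weights[l][j][i]: input i of layer l -> neuron j
        self.biases = []
        for n_in, n_out in zip(sizes[:-1], sizes[1:]):
            scale = 1.0 / math.sqrt(n_in)  # assumed init, not from the post
            self.weights.append(
                [[rng.uniform(-scale, scale) for _ in range(n_in)]
                 for _ in range(n_out)]
            )
            self.biases.append([0.0] * n_out)

    def forward(self, x):
        # Returns the activations of every layer, input included.
        activations = [x]
        for W, b in zip(self.weights, self.biases):
            x = [sigmoid(sum(w * a for w, a in zip(row, x)) + bj)
                 for row, bj in zip(W, b)]
            activations.append(x)
        return activations

    def train_step(self, x, target, lr):
        # One SGD update on a single sample; returns the loss before the update.
        acts = self.forward(x)
        out = acts[-1]
        # Assumed loss: E = 0.5 * sum((out - target)^2), so dE/dz = (o - t) * o * (1 - o).
        delta = [(o - t) * o * (1.0 - o) for o, t in zip(out, target)]
        for l in reversed(range(len(self.weights))):
            W, a_in = self.weights[l], acts[l]
            prev = None
            if l > 0:
                # Backpropagate through the old weights before changing them.
                prev = [
                    sum(W[j][i] * delta[j] for j in range(len(W)))
                    * a_in[i] * (1.0 - a_in[i])
                    for i in range(len(a_in))
                ]
            for j, row in enumerate(W):
                for i in range(len(row)):
                    row[i] -= lr * delta[j] * a_in[i]
                self.biases[l][j] -= lr * delta[j]
            if prev is not None:
                delta = prev
        return 0.5 * sum((o - t) ** 2 for o, t in zip(out, target))


if __name__ == "__main__":
    net = Network([784, 128, 32, 10])
    pixels = [0.0] * 784              # placeholder input, not real MNIST data
    label = [0.0] * 10
    label[3] = 1.0                    # one-hot target for the digit 3
    print(net.train_step(pixels, label, lr=0.001))
```

Since `lr` is passed per step, it's easy to compare different learning rates against the same seed and see how quickly the error moves.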
