I read a quip from someone today suggesting that "modern GPUs" don't need more than 8 PCIe 3.0 lanes (~8 GB/s)... which amused me, since I'd just discovered my CPU + motherboard combo (a Ryzen 3400G on a B450) was capping my GPU at exactly that.
Speaking of GPUs, I managed to pick up an RTX A4000 with 16 GB of VRAM for a reasonable price (it's a former mining card). It already trains 4x faster than the GTX 1080, and if its Tensor Core specs are to be believed, it should be capable of even more. 🤔
When I dug into it, running the primary x16 slot at x8 is a "feature" of the Ryzen 3400G (and older Ryzen CPUs). Most other 3000-series CPUs don't have this limitation. 😑
As for the B450 chipset, some rare motherboards do actually support x16 PCIe 4.0 (it's a CPU feature, not a chipset one), but only with capable CPUs and only in beta firmware that AMD asked partners not to ship. 😑
Replacing the CPU will double my available bandwidth. Replacing both will quadruple it. I hope doubling is enough. 😅
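For the curious, the doubling/quadrupling math sketches out like this (a rough sketch using the standard per-lane PCIe rates, ignoring protocol overhead beyond line encoding):

```python
# Rough PCIe bandwidth math: x8 Gen3 (now) vs x16 Gen3 (new CPU)
# vs x16 Gen4 (new CPU + board).
# Gen3 runs at 8 GT/s per lane with 128b/130b encoding, so each
# lane carries ~0.985 GB/s. Gen4 doubles the signaling rate.
GEN3_PER_LANE_GBPS = 8 * (128 / 130) / 8   # ~0.985 GB/s per lane
GEN4_PER_LANE_GBPS = 2 * GEN3_PER_LANE_GBPS

current   = 8  * GEN3_PER_LANE_GBPS  # x8 Gen3  -> ~7.9 GB/s
cpu_swap  = 16 * GEN3_PER_LANE_GBPS  # x16 Gen3 -> ~15.8 GB/s (2x)
both_swap = 16 * GEN4_PER_LANE_GBPS  # x16 Gen4 -> ~31.5 GB/s (4x)

print(f"now: {current:.1f} GB/s, "
      f"new CPU: {cpu_swap:.1f} GB/s, "
      f"new CPU+board: {both_swap:.1f} GB/s")
```

That ~7.9 GB/s figure is also where the quip's "8 GB" number comes from.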
@dpwiz Having a "dedicated box" is the plan. PyTorch has a "job server" feature, so in theory any machine on my network could send jobs to it.
Right now I'm just SSH'ing in from my laptop.