Hey lazymastodon, I have a linear algebra question.
So I've been thinking a bit about principle component analysis as of late. The way to find the vector of most variance in a multidimensional dataset is to put every datapoint in a column matrix, multiply that matrix by its transpose, and find the eigenvectors of the resulting square matrix.
Here's my question: I don't have a good intuition for what "multiply the matrix by its transpose" is doing. That compares every point to every other point by multiplying the same-dimension components together and summing the result across dimensions, but like... Why does that result in an interesting matrix instead of a pile of noise?
Career software engineer living something approximating the dream he had as a kid.