Implicit Bias of Large Depth Networks: a Notion of Rank for Nonlinear Functions

We show that the representation cost of fully connected neural networks with
homogeneous nonlinearities - which describes the implicit bias in function
space of networks with $L_2$-regularization or with losses such as the
cross-entropy - converges as the depth of the network goes to infinity to a
notion of rank over nonlinear functions. We then inquire under which conditions
the global minima of the loss recover the `true' rank of the data: we show that
for too large depths the global minimum is approximately rank 1
(underestimating the rank); we then argue that there is a range of depths,
growing with the number of datapoints, over which the true rank is recovered. Finally,
we discuss the effect of the rank of a classifier on the topology of the
resulting class boundaries and show that autoencoders with optimal nonlinear
rank are naturally denoising.
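
As a rough illustration (a sketch, not the paper's exact statement; the precise normalization and the definition of the nonlinear rank are given in the paper), the representation cost of a function $f$ realized by depth-$L$ networks on a domain $\Omega$ is commonly defined as the minimal squared parameter norm needed to represent $f$, and the depth limit described above then takes the schematic form
\[
  R(f;\Omega,L) \;=\; \min_{\theta \,:\, f_\theta|_\Omega \,=\, f|_\Omega} \|\theta\|_2^2,
  \qquad
  \lim_{L\to\infty} \frac{R(f;\Omega,L)}{L} \;=\; \operatorname{Rank}(f;\Omega),
\]
where $\operatorname{Rank}(f;\Omega)$ denotes the paper's notion of rank for nonlinear functions.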