Deep Linear Networks: Characterizing the Minimizer

This article characterizes the minimizer that gradient descent reaches in deep neural networks. It is Part 3 of my notes on implicit regularisation in Deep Linear Networks.