TheEffectofNetworkWidthonStochasticGradientDescentandGeneralization:anEmpiricalStudyDanielS.Park12JaschaSohl-Dickstein1QuocV.Le1SamuelL.Smith3AbstractWilsonetal.,2017;Sagunetal.,2017;Mandtetal.,201...
WidthProvablyMattersinOptimizationforDeepLinearNeuralNetworksSimonS.Du1WeiHu2Abstractconvergestoglobalminimumunderfurtherassumptionsonbothdataandglobalminimum.TheseresultsrequireWeprovethatforanL-l...