3 results
Flat high-dimensional regions slowing training.
Restricting updates to safe regions.
A wide basin often correlated with better generalization.