Implementation of the noise-injected stochastic gradient descent (SGD) proposed in the paper "Learning ReLU Networks on Linearly Separable Data: Algorithm, Optimality, and Generalization" by G. Wang, G. B. Giannakis, and J. Chen.
Learning ReLUs (for single-hidden-layer neural networks)
Noise-injected SGD
Plain-vanilla SGD
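
The difference between the two variants listed above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code in this repository and not necessarily the exact algorithm of the paper. It assumes a single-hidden-layer ReLU network with fixed second-layer weights v_j in {+1, -1}, a hinge loss, and Gaussian noise added to the pre-activation when deciding which hidden units are active. The names `sgd_step`, `train`, and `noise_std` are hypothetical. Where exactly the noise enters should be checked against the paper.

```python
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def predict(W, v, x):
    """Network output: sum_j v_j * relu(w_j . x)."""
    return sum(vj * max(0.0, dot(wj, x)) for wj, vj in zip(W, v))


def sgd_step(W, v, x, y, lr=0.1, noise_std=0.0, rng=random):
    """One hinge-loss SGD step on the hidden weights W (updated in place).

    noise_std == 0 gives plain-vanilla SGD. noise_std > 0 perturbs the
    activation test of each hidden unit (an assumption of this sketch).
    Returns True if the weights were updated, False if the loss was zero.
    """
    if y * predict(W, v, x) >= 1.0:
        return False  # zero hinge loss: no update
    for wj, vj in zip(W, v):
        noise = rng.gauss(0.0, noise_std) if noise_std > 0 else 0.0
        if dot(wj, x) + noise >= 0.0:  # unit treated as active
            for i, xi in enumerate(x):
                wj[i] += lr * y * vj * xi
    return True


def train(data, hidden=4, epochs=50, lr=0.1, noise_std=0.0, seed=0):
    """Run SGD over data, a list of (x, y) pairs with y in {+1, -1}."""
    rng = random.Random(seed)
    dim = len(data[0][0])
    W = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(hidden)]
    v = [1.0 if j % 2 == 0 else -1.0 for j in range(hidden)]
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x, y in order:
            sgd_step(W, v, x, y, lr, noise_std, rng)
    return W, v
```

Calling `train(data, noise_std=0.0)` runs the plain-vanilla variant, while a positive `noise_std` switches on the noise injection. With noise, a hidden unit whose pre-activation is slightly negative can still receive an update, which is the mechanism this sketch uses to illustrate how such units can be moved again.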