11/6/2023 Difference between SGD and ADAM by Xiaoxi Shen and Jialong Li

Abstract: As the dimensionality of the data we encounter nowadays grows, classical gradient descent methods become too slow and need to be accelerated. In this presentation, we will start by reviewing gradient descent and stochastic gradient descent (SGD) and discussing their theoretical guarantees. After that, optimization algorithms more commonly used in deep learning will be introduced and…
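As a rough preview of the comparison the talk addresses, the sketch below contrasts the two update rules on a toy one-dimensional quadratic. It is an illustrative assumption, not material from the talk: SGD takes a plain step against the gradient, while Adam rescales each step using exponential moving averages of the gradient and its square (with bias correction), as in the standard Adam formulation.

```python
import numpy as np

def sgd_step(theta, grad, lr=0.1):
    # Plain gradient descent: step against the raw gradient.
    return theta - lr * grad

def adam_step(theta, grad, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    # Adam: exponential moving averages of the gradient (m) and its
    # square (v), with bias correction for the early iterations t.
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# Minimize f(theta) = theta^2, whose gradient is 2 * theta.
theta_sgd = theta_adam = 5.0
m = v = 0.0
for t in range(1, 201):
    theta_sgd = sgd_step(theta_sgd, 2 * theta_sgd)
    theta_adam, m, v = adam_step(theta_adam, 2 * theta_adam, m, v, t)

print(theta_sgd, theta_adam)
```

Note the qualitative difference this exposes: SGD's step shrinks automatically as the gradient shrinks, while Adam's normalized step stays roughly of size `lr` until its moving averages adapt, one of the behaviors behind the convergence differences the talk examines.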
