bubudumbdumb t1_j28j898 wrote

TIL: Nesterov momentum is an extension of momentum that takes the decaying moving average of gradients evaluated at projected (look-ahead) positions in the search space, rather than at the actual current positions.
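
To make the "projected position" idea concrete, here is a minimal sketch of one common formulation of Nesterov momentum on a toy quadratic. The objective, the matrix `A`, the step size `lr`, and the momentum coefficient `mu` are all assumptions picked for illustration, not anything from a particular library.

```python
import numpy as np

# Toy objective f(x) = 0.5 * x^T A x (assumed for illustration); its minimum is at 0.
A = np.diag([1.0, 10.0])

def grad(x):
    # Gradient of the toy quadratic above.
    return A @ x

def nesterov(x0, lr=0.05, mu=0.9, steps=200):
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)  # velocity: decaying accumulation of past gradient steps
    for _ in range(steps):
        lookahead = x + mu * v              # projected position, not the current one
        v = mu * v - lr * grad(lookahead)   # decay the old velocity, add the look-ahead gradient step
        x = x + v                           # move by the updated velocity
    return x

x_final = nesterov([3.0, -2.0])
```

Classical momentum would be the same loop with `grad(x)` in place of `grad(lookahead)`; that one line is the whole difference.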

I took a course on control theory, and the ingredients of Nesterov momentum look like common building blocks of linear control systems: a moving average and a decay. PID control is the industrial application of linear control theory.
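
A rough way to see the control-theory flavor: the momentum velocity update is a first-order linear recursion, so it equals an exponentially weighted sum of past gradients, which is what a leaky integrator does. The sketch below checks that equivalence numerically; the function names and the sample gradient values are made up for illustration.

```python
def velocity_recursive(grads, mu=0.9):
    # Momentum-style recursion: v_t = mu * v_{t-1} + g_t, starting from v = 0.
    v = 0.0
    for g in grads:
        v = mu * v + g
    return v

def velocity_unrolled(grads, mu=0.9):
    # The same quantity written as an explicit sum with exponentially decaying weights.
    n = len(grads)
    return sum(mu ** (n - 1 - k) * g for k, g in enumerate(grads))

sample_grads = [1.0, -0.5, 2.0, 0.25]  # arbitrary illustrative values
v_rec = velocity_recursive(sample_grads)
v_sum = velocity_unrolled(sample_grads)
```

With `mu = 1` the recursion becomes a plain running sum, which is the discrete analogue of the integral term in a PID controller; with `mu < 1` old inputs are gradually forgotten.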
