Optimization Algorithms

Mini-Batch Gradient Descent

: A way to make the optimization algorithm run faster by splitting the m training examples into mini-batches of, say, 512 or 1024 examples each and processing one mini-batch at a time

Suppose the training set is split into 5000 mini-batches of 1000 examples each (so m = 5,000,000 in total).

1 epoch (= a single pass through the training set) ⇒ 5000 gradient descent steps

In contrast, Batch Gradient Descent processes all m examples before each update:

1 epoch ⇒ 1 gradient descent step
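
A minimal sketch of the loop in Python/NumPy; `grad_fn`, the parameter dict, and the one-column-per-example data layout are assumptions for illustration, not from these notes:

```python
import numpy as np

def mini_batch_gd(X, Y, params, grad_fn, lr=0.01, batch_size=1000,
                  epochs=10, seed=0):
    """Gradient descent taking one step per mini-batch.

    X: (n_features, m) inputs, Y: (1, m) labels (one column per example).
    grad_fn(params, X_batch, Y_batch) -> dict of gradients, same keys as params.
    """
    rng = np.random.default_rng(seed)
    m = X.shape[1]
    for _ in range(epochs):              # 1 epoch = a single pass through the training set
        perm = rng.permutation(m)        # reshuffle so mini-batches differ between epochs
        X_s, Y_s = X[:, perm], Y[:, perm]
        for t in range(0, m, batch_size):
            X_b = X_s[:, t:t + batch_size]   # the last batch may be smaller than batch_size
            Y_b = Y_s[:, t:t + batch_size]
            grads = grad_fn(params, X_b, Y_b)
            for k in params:             # one gradient descent step per mini-batch
                params[k] -= lr * grads[k]
    return params
```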

Understanding Mini-Batch Gradient Descent
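
In brief: with mini-batch gradient descent the cost J no longer decreases monotonically; it trends downward with noise, because each step is computed on a different mini-batch. The mini-batch size is a hyperparameter: size m is just Batch Gradient Descent (each step is slow on a large training set), size 1 is Stochastic Gradient Descent (loses the speedup from vectorization), and in-between values, typically powers of two such as 64–512 that fit in CPU/GPU memory, usually work best.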

Exponentially Weighted Averages
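
The exponentially weighted average of a series θ_1, θ_2, … is v_t = β·v_{t−1} + (1 − β)·θ_t, which roughly averages over the last 1/(1 − β) values (about 10 for β = 0.9). A minimal sketch in Python; the function name `ewa` is just for illustration:

```python
def ewa(values, beta=0.9):
    """Exponentially weighted average: v_t = beta * v_{t-1} + (1 - beta) * theta_t.

    Roughly averages over the last 1 / (1 - beta) values (~10 for beta = 0.9).
    """
    v, out = 0.0, []
    for theta in values:
        v = beta * v + (1 - beta) * theta
        out.append(v)
    return out
```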

Bias Correction
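
Because v_0 = 0, the early values of the exponentially weighted average underestimate the data; dividing by (1 − β^t) corrects this, and the correction factor fades to 1 as t grows. A sketch extending the `ewa` function above:

```python
def ewa_bias_corrected(values, beta=0.9):
    """EWA divided by (1 - beta**t), so early values are not
    dragged toward the v_0 = 0 initialization.
    """
    v, out = 0.0, []
    for t, theta in enumerate(values, start=1):
        v = beta * v + (1 - beta) * theta
        out.append(v / (1 - beta ** t))  # factor -> 1 as t grows, so late values are unchanged
    return out
```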

Gradient Descent with Momentum
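
Momentum keeps an exponentially weighted average of the gradients and uses it for the update: v_dW = β·v_dW + (1 − β)·dW, then W := W − α·v_dW (and likewise for b). This damps the oscillations of plain (mini-batch) gradient descent and speeds progress along the consistent direction. A minimal sketch in Python; the dict-of-arrays layout and the function name are assumptions:

```python
def momentum_step(params, grads, velocity, lr=0.01, beta=0.9):
    """One update of gradient descent with momentum.

    velocity holds the exponentially weighted average of past gradients,
    one entry per parameter, initialized to zeros.
    """
    for k in params:
        velocity[k] = beta * velocity[k] + (1 - beta) * grads[k]  # v = beta*v + (1-beta)*dW
        params[k] -= lr * velocity[k]                             # W = W - alpha*v
    return params, velocity
```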