Optimization Algorithms
Mini-Batch Gradient Descent
: A way to make the optimization algorithm run faster by splitting the m training examples into mini-batches of, say, 512 or 1024 examples and taking one gradient step per batch
![Untitled](https://s3-us-west-2.amazonaws.com/secure.notion-static.com/2be95414-9197-43c3-81d7-3b637876c390/Untitled.png)
- Written out, mini-batch gradient descent looks like the following.
Suppose there are m = 5,000,000 training examples in total and the mini-batch size is 1000, giving 5000 mini-batches.
1 epoch (= a single pass through the training set) ⇒ 5000 gradient descent steps
![Untitled](https://s3-us-west-2.amazonaws.com/secure.notion-static.com/10b881f7-172e-46f7-9fec-9ebad87edcf8/Untitled.png)
In contrast, batch gradient descent gives
1 epoch ⇒ 1 gradient descent step
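A minimal sketch of one epoch of mini-batch gradient descent, assuming column-major data X of shape (n_x, m) as in the lecture and a hypothetical `grad_fn` that runs forward and backward propagation on one batch (the function and parameter names are illustrative, not from the notes):

```python
import numpy as np

def mini_batch_gd_epoch(X, Y, params, grad_fn, batch_size=1000, lr=0.01):
    """One epoch = one gradient descent step per mini-batch."""
    m = X.shape[1]                              # number of training examples
    for t in range(0, m, batch_size):           # m = 5,000,000, size 1000 -> 5000 steps
        X_t = X[:, t:t + batch_size]            # t-th mini-batch of inputs
        Y_t = Y[:, t:t + batch_size]            # t-th mini-batch of labels
        grads = grad_fn(X_t, Y_t, params)       # forward/backward on this batch only
        for key in params:                      # e.g. "W1", "b1", ...
            params[key] -= lr * grads["d" + key]
    return params
```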
Understanding Mini-Batch Gradient Descent
- Batch Gradient Descent vs. Mini-Batch Gradient Descent
![Untitled](https://s3-us-west-2.amazonaws.com/secure.notion-static.com/cca3461e-94bd-4c53-8bfc-cbb67e91da4e/Untitled.png)
![Untitled](https://s3-us-west-2.amazonaws.com/secure.notion-static.com/465b86f9-8e75-4677-8e58-533682462bea/Untitled.png)
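The two figures contrast the cost curves: with batch gradient descent the cost should decrease on every iteration, whereas with mini-batch gradient descent it oscillates, since each step sees a different batch, but still trends downward. A sketch of the usual shuffle-and-partition step that builds the mini-batches (names are my own, not from the notes):

```python
import numpy as np

def random_mini_batches(X, Y, batch_size=1000, seed=0):
    """Shuffle the m examples, then slice them into consecutive mini-batches."""
    rng = np.random.default_rng(seed)
    m = X.shape[1]
    perm = rng.permutation(m)                   # random column order
    X_shuf, Y_shuf = X[:, perm], Y[:, perm]
    batches = []
    for t in range(0, m, batch_size):           # last batch may be smaller
        batches.append((X_shuf[:, t:t + batch_size],
                        Y_shuf[:, t:t + batch_size]))
    return batches
```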
Exponentially Weighted Averages
![Untitled](https://s3-us-west-2.amazonaws.com/secure.notion-static.com/a19b723e-3773-4192-bde8-6b63ae1b68b2/Untitled.png)
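The recurrence on the slide is v_t = β·v_{t−1} + (1 − β)·θ_t, which averages over roughly 1/(1 − β) recent values (β = 0.9 ≈ the last 10 values). A minimal sketch with made-up data:

```python
import numpy as np

def ewa(theta, beta=0.9):
    """Exponentially weighted average: v_t = beta*v_{t-1} + (1-beta)*theta_t, v_0 = 0."""
    v = np.zeros_like(theta, dtype=float)
    prev = 0.0
    for t, x in enumerate(theta):
        prev = beta * prev + (1 - beta) * x
        v[t] = prev
    return v

# e.g. smoothing a noisy temperature series (illustrative numbers)
temps = np.array([4.0, 9.0, 6.0, 7.0, 5.0, 10.0, 8.0])
print(ewa(temps, beta=0.9))
```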
Bias Correction
![Untitled](https://s3-us-west-2.amazonaws.com/secure.notion-static.com/ff205c0b-8408-485c-b619-8b9251c51c51/Untitled.png)
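Because v_0 = 0, the early values of v_t are biased toward zero; the correction on the slide divides by (1 − β^t), which is small for small t and approaches 1 as t grows. A sketch extending the average above:

```python
def ewa_corrected(theta, beta=0.9):
    """Bias-corrected average: v_t / (1 - beta**t) undoes the v_0 = 0 warm-up bias."""
    v = 0.0
    out = []
    for t, x in enumerate(theta, start=1):      # t starts at 1 so 1 - beta**t > 0
        v = beta * v + (1 - beta) * x
        out.append(v / (1 - beta ** t))         # the correction matters most for small t
    return out
```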
Gradient Descent with Momentum
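The notes end at this heading. As a hedged sketch, the momentum update from the lecture keeps an exponentially weighted average of the gradients and steps along it: v_dW = β·v_dW + (1 − β)·dW, then W := W − α·v_dW (the variable names follow the course notation; the helper itself is mine):

```python
def momentum_step(params, grads, v, beta=0.9, lr=0.01):
    """One gradient descent step with momentum.

    v holds the running average of past gradients; averaging damps the
    oscillations of mini-batch steps and speeds progress toward the minimum.
    """
    for key in params:                          # e.g. "W1", "b1", ...
        v[key] = beta * v[key] + (1 - beta) * grads["d" + key]
        params[key] -= lr * v[key]
    return params, v
```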