본문 바로가기

Deep Learning

(2)

Batch Normalization Layer 를 많이 쌓게되면 학습을 하는 동안 각 layer 의 input 의 분포가 계속 달라지게 된다. 이런 현상을 internal covariate shift 라 하는데, 이로 인하여 모델의 학습이 어렵고, learning rate 를 낮게 셋팅해야 하는 문제가 발생한다. Batch normalization 은 internal covariate shift 를 해결하기 위해 layer 의 input batch 를 normalization 하는 방법이다. Batch normalization 은 non-linear activation funtion 앞에 배치되며, activation function 의 input 에 대하여 아래와 같은 transformation 을 적용한다. $n$ 은 batch siz..

직관적인 Universal Approximation Theorem 증명 Bias-variance trade-off 포스트에서 언급된 bias loss 를 줄이기 위해서는 feed-forward neural network 를 사용해볼 수 있다. 이런 feed-forward neural network 의 학습능력의 바탕에는 universal approximation theorem 이 있다. Universal approximation theorem 의 내용은 아래와 같다. 임의의 개수의 neuron 을 포함하고, activation function 이 sigmoid 이면서, 1 hidden layer 를 가진 feed-forward neural network 는 적절한 weights 만 주어진다면 어떤 함수든 근사화 할 수 있다. 컴퓨터공학도에게 위의 내용을 엄밀하게 증명하는 건..

이전 1 다음

티스토리툴바