4.5 Natural Gradients
When using the symmetric KL divergence to measure the distance between two close distributions, the corresponding Riemannian metric is the Fisher Information Matrix (FIM), defined as the Hessian of the divergence around θ, i.e. the matrix of second-order derivatives w.r.t. the elements of θ:

\(F(\theta)=\nabla^2 D_{JS}(p(x;\theta)\mid\mid p(x;\theta + \Delta\theta))\mid_{\Delta\theta=0}\)

where \(D_{JS}\) denotes the symmetric KL divergence. Equivalently, the FIM can be computed as the expected outer product of the score function:

\(F(\theta)=\mathbb{E}_{x\sim p(x;\theta)}[\nabla \log p(x;\theta)\,(\nabla \log p(x;\theta))^T]\)
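As a quick sanity check (not from the referenced notes), the expectation form can be estimated by Monte Carlo sampling of the score function. The sketch below assumes a hypothetical 1D Gaussian \(p(x;\mu)\) with fixed variance \(\sigma^2\), whose score w.r.t. μ is \((x-\mu)/\sigma^2\) and whose Fisher information is \(1/\sigma^2\):

```python
import numpy as np

# Hypothetical example: 1D Gaussian p(x; mu) with fixed sigma.
# Score w.r.t. mu:  d/dmu log p(x; mu) = (x - mu) / sigma^2
# Analytic Fisher information:  F(mu) = 1 / sigma^2
mu, sigma = 0.5, 2.0
rng = np.random.default_rng(0)
x = rng.normal(mu, sigma, size=100_000)

score = (x - mu) / sigma**2        # per-sample gradient of the log-likelihood
fim_mc = np.mean(score**2)         # Monte Carlo estimate of E[score * score^T]

print(f"Monte Carlo FIM: {fim_mc:.4f}")   # close to 0.25
print(f"Analytic FIM:    {1 / sigma**2:.4f}")
```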
Why is it useful? The FIM allows us to locally approximate (for small Δθ) the symmetric KL divergence between two close distributions, using a second-order Taylor series expansion:

\(D_{JS}(p(x;\theta) \mid\mid p(x;\theta+\Delta\theta))\approx\Delta\theta^T F(\theta)\,\Delta\theta\)

The divergence is therefore locally quadratic in Δθ, which means that the update rule obtained when minimizing it with gradient descent is linear in Δθ.
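To illustrate the quadratic approximation concretely, the sketch below (again a hypothetical Gaussian example, not taken from the referenced notes) parameterizes a 1D Gaussian by θ = (μ, σ), for which the FIM is diag(1/σ², 2/σ²), and compares the closed-form symmetric KL with \(\Delta\theta^T F(\theta)\Delta\theta\) for shrinking perturbations:

```python
import numpy as np

def kl_gauss(mu1, s1, mu2, s2):
    """Closed-form KL( N(mu1, s1^2) || N(mu2, s2^2) )."""
    return np.log(s2 / s1) + (s1**2 + (mu1 - mu2) ** 2) / (2 * s2**2) - 0.5

# Hypothetical example: theta = (mu, sigma), FIM = diag(1/sigma^2, 2/sigma^2).
mu, sigma = 0.5, 2.0
fim = np.diag([1 / sigma**2, 2 / sigma**2])

for eps in (0.1, 0.01):
    dtheta = np.array([eps, eps])              # small perturbation of (mu, sigma)
    d_sym = (kl_gauss(mu, sigma, mu + eps, sigma + eps)
             + kl_gauss(mu + eps, sigma + eps, mu, sigma))
    quad = dtheta @ fim @ dtheta               # local quadratic approximation
    print(f"eps={eps:5.2f}  sym-KL={d_sym:.6f}  quadratic form={quad:.6f}")
```

The two quantities agree more and more closely as Δθ shrinks, matching the second-order expansion above.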
Reference
https://julien-vitay.net/deeprl/NaturalGradient.html#sec:natural-policy-gradient-and-natural-actor-critic-nac