$$ \newcommand{\dint}{\mathrm{d}} \newcommand{\vphi}{\boldsymbol{\phi}} \newcommand{\vpi}{\boldsymbol{\pi}} \newcommand{\vpsi}{\boldsymbol{\psi}} \newcommand{\vomg}{\boldsymbol{\omega}} \newcommand{\vsigma}{\boldsymbol{\sigma}} \newcommand{\vzeta}{\boldsymbol{\zeta}} \renewcommand{\vx}{\mathbf{x}} \renewcommand{\vy}{\mathbf{y}} \renewcommand{\vz}{\mathbf{z}} \renewcommand{\vh}{\mathbf{h}} \renewcommand{\b}{\mathbf} \renewcommand{\vec}{\mathrm{vec}} \newcommand{\vecemph}{\mathrm{vec}} \newcommand{\mvn}{\mathcal{MN}} \newcommand{\G}{\mathcal{G}} \newcommand{\M}{\mathcal{M}} \newcommand{\N}{\mathcal{N}} \newcommand{\S}{\mathcal{S}} \newcommand{\I}{\mathcal{I}} \newcommand{\diag}[1]{\mathrm{diag}(#1)} \newcommand{\diagemph}[1]{\mathrm{diag}(#1)} \newcommand{\tr}[1]{\text{tr}(#1)} \renewcommand{\C}{\mathbb{C}} \renewcommand{\R}{\mathbb{R}} \renewcommand{\E}{\mathbb{E}} \newcommand{\D}{\mathcal{D}} \newcommand{\inner}[1]{\langle #1 \rangle} \newcommand{\innerbig}[1]{\left \langle #1 \right \rangle} \newcommand{\abs}[1]{\lvert #1 \rvert} \newcommand{\norm}[1]{\lVert #1 \rVert} \newcommand{\two}{\mathrm{II}} \newcommand{\GL}{\mathrm{GL}} \newcommand{\Id}{\mathrm{Id}} \newcommand{\grad}[1]{\mathrm{grad} \, #1} \newcommand{\gradat}[2]{\mathrm{grad} \, #1 \, \vert_{#2}} \newcommand{\Hess}[1]{\mathrm{Hess} \, #1} \newcommand{\T}{\text{T}} \newcommand{\dim}[1]{\mathrm{dim} \, #1} \newcommand{\partder}[2]{\frac{\partial #1}{\partial #2}} \newcommand{\rank}[1]{\mathrm{rank} \, #1} \newcommand{\inv}1 \newcommand{\map}{\text{MAP}} \newcommand{\L}{\mathcal{L}} \DeclareMathOperator*{\argmax}{arg\,max} \DeclareMathOperator*{\argmin}{arg\,min} $$

Modern Arts of Laplace Approximations

The Laplace approximation (LA) is a simple yet powerful class of methods for approximating intractable posteriors, yet it is largely overlooked in the Bayesian deep learning community. Here, we review the LA and highlight a recent software library for applying it to deep neural networks.
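The core recipe can be illustrated with a small, self-contained NumPy/SciPy sketch (a generic illustration, not the API of any particular library; `laplace_approx` and the toy log-posterior below are hypothetical names): find the MAP estimate of the posterior, then fit a Gaussian centered at the mode with covariance equal to the inverse Hessian of the negative log-posterior there.

```python
import numpy as np
from scipy.optimize import minimize

def laplace_approx(neg_log_post, theta0):
    """Fit a Gaussian N(theta_map, H^{-1}) to the density exp(-neg_log_post)."""
    res = minimize(neg_log_post, theta0)
    theta_map = res.x
    # Hessian of the negative log-posterior at the mode, via finite differences
    eps = 1e-4
    d = len(theta_map)
    H = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            e_i, e_j = np.zeros(d), np.zeros(d)
            e_i[i], e_j[j] = eps, eps
            H[i, j] = (neg_log_post(theta_map + e_i + e_j)
                       - neg_log_post(theta_map + e_i)
                       - neg_log_post(theta_map + e_j)
                       + neg_log_post(theta_map)) / eps**2
    return theta_map, np.linalg.inv(H)

# Sanity check: for a Gaussian log-posterior the approximation is exact.
nlp = lambda th: 0.5 * (th[0] - 2.0) ** 2 / 0.25
mean, cov = laplace_approx(nlp, np.array([0.0]))
```

For deep networks the same idea applies with the MAP estimate given by standard training and the Hessian replaced by a tractable surrogate (e.g. a generalized Gauss-Newton or Kronecker-factored approximation).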

Chentsov's Theorem

The Fisher information is often the default choice of Riemannian metric for manifolds of probability distributions. In this post, we study Chentsov's theorem, which justifies this choice: the Fisher information is, up to a scaling constant, the unique Riemannian metric that is invariant under sufficient statistics. This fact makes the Fisher metric stand out from other choices.
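For reference, the metric in question is the standard Fisher information, whose components in coordinates $\theta = (\theta^1, \dots, \theta^n)$ are

$$ g_{ij}(\theta) = \E_{x \sim p(x \mid \theta)} \left[ \partder{\log p(x \mid \theta)}{\theta^i} \, \partder{\log p(x \mid \theta)}{\theta^j} \right] . $$

Invariance under sufficient statistics means that pulling the metric back along a sufficient statistic $T$ (a map that loses no information about $\theta$) leaves it unchanged.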

The Curvature of the Manifold of Gaussian Distributions

The Gaussian probability distribution is central in statistics and machine learning. As it turns out, by equipping the set of all Gaussian p.d.f.s with the Riemannian metric given by the Fisher information, we can view it as a Riemannian manifold. In this post, we prove that this manifold can be covered by a single coordinate chart and has constant negative curvature.
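As a preview of the result in the univariate case (a standard computation): in coordinates $(\mu, \sigma)$ on the upper half-plane, the Fisher metric of $\N(\mu, \sigma^2)$ is

$$ \dint s^2 = \frac{\dint \mu^2 + 2 \, \dint \sigma^2}{\sigma^2} , $$

which, after rescaling $\mu \mapsto \mu / \sqrt{2}$, is a constant multiple of the hyperbolic metric on the upper half-plane. Its sectional curvature is therefore the constant $-1/2$.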

Hessian and Curvatures in Machine Learning: A Differential-Geometric View

In machine learning, especially in neural networks, the Hessian matrix is often treated synonymously with curvature. But, from calculus alone, it is not clear why one can say so. Here, we view the loss landscape of a neural network as a hypersurface and apply a differential-geometric analysis to it.
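The key fact connecting the two notions (standard for graph hypersurfaces, stated here as a preview) is the following: view the loss landscape as the graph $\{ (\theta, \L(\theta)) \} \subset \R^{n+1}$. At a critical point $\theta^*$, i.e. where $\nabla \L(\theta^*) = 0$, the second fundamental form of this hypersurface coincides with the Hessian,

$$ \two_{\theta^*} = \nabla^2 \L(\theta^*) , $$

so the eigenvalues of the Hessian at a minimum are exactly the principal curvatures of the loss surface there. Away from critical points the two differ by a gradient-dependent normalization.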

Optimization and Gradient Descent on Riemannian Manifolds

Optimization is one of the most ubiquitous applications of differential geometry. In this article, we discuss the familiar optimization problem on Euclidean spaces by focusing on the gradient descent method, and then generalize it to Riemannian manifolds.
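The generalization follows a simple recipe: compute the Euclidean gradient, project it onto the tangent space of the manifold, take a step in that direction, and retract back onto the manifold. A minimal sketch on the unit sphere (illustrative names, minimizing the Rayleigh quotient $f(x) = x^\T A x$, whose minimizer is the eigenvector of the smallest eigenvalue):

```python
import numpy as np

def riemannian_gd_sphere(A, x0, lr=0.1, steps=500):
    """Minimize f(x) = x^T A x over the unit sphere by Riemannian GD."""
    x = x0 / np.linalg.norm(x0)
    for _ in range(steps):
        euc_grad = 2 * A @ x                        # Euclidean gradient of f
        riem_grad = euc_grad - (x @ euc_grad) * x   # project onto tangent space at x
        x = x - lr * riem_grad                      # step in the tangent direction
        x = x / np.linalg.norm(x)                   # retract back onto the sphere
    return x

A = np.diag([3.0, 2.0, 1.0])
x = riemannian_gd_sphere(A, np.array([1.0, 1.0, 1.0]))
# x converges to the eigenvector of the smallest eigenvalue (up to sign)
```

Here the normalization step is a retraction, a cheap first-order substitute for the exact exponential map of the sphere.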

Notes on Riemannian Geometry

This article is a collection of small notes on Riemannian geometry that I find useful as references. It is largely based on Lee's books on smooth and Riemannian manifolds.

Minkowski's, Dirichlet's, and Two Squares Theorem

Applications of Minkowski's Theorem to geometry problems, Dirichlet's Approximation Theorem, and the Two Squares Theorem.

Reduced Betti Numbers of the Sphere: The Mayer-Vietoris Theorem

A computation of the reduced homology of the sphere using the Mayer-Vietoris sequence.

Brouwer's Fixed Point Theorem: A Proof with Reduced Homology

A proof of the special case (the closed ball) of Brouwer's Fixed Point Theorem using reduced homology.

Natural Gradient Descent

Intuition and derivation of natural gradient descent.
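The method preconditions the gradient with the inverse Fisher information, $\theta \leftarrow \theta - \eta \, F^{-1} \nabla \L(\theta)$, so that steps are measured in KL divergence between distributions rather than in raw parameter space. A minimal sketch (illustrative names) for fitting a univariate Gaussian by maximum likelihood, where the Fisher matrix in coordinates $(\mu, \sigma)$ is $\diagemph{1/\sigma^2, 2/\sigma^2}$:

```python
import numpy as np

def natural_gd_gaussian(data, mu=0.0, sigma=1.0, lr=0.5, steps=100):
    """Fit N(mu, sigma^2) to data by natural gradient descent on the average NLL."""
    for _ in range(steps):
        # Euclidean gradients of the average negative log-likelihood
        g_mu = (mu - data.mean()) / sigma**2
        g_sigma = 1.0 / sigma - ((data - mu) ** 2).mean() / sigma**3
        # Precondition with the inverse Fisher F = diag(1/sigma^2, 2/sigma^2)
        F_inv = np.diag([sigma**2, sigma**2 / 2.0])
        nat = F_inv @ np.array([g_mu, g_sigma])
        mu, sigma = mu - lr * nat[0], sigma - lr * nat[1]
    return mu, sigma

rng = np.random.default_rng(0)
data = rng.normal(3.0, 2.0, size=10_000)
mu, sigma = natural_gd_gaussian(data)
```

Note how the preconditioning cancels the $1/\sigma^2$ scale of the plain gradient: the effective step size no longer depends on the current parametrization, which is the practical payoff of the natural gradient.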