# Blog Posts

- Writing Advice for Fledgling Machine Learning Researchers
Some bullet-point advice regarding paper-writing I would give to my younger self.

- The 'use' Expression in Gleam
How can we emulate the behavior of Python's `with` and Rust's `?` in Gleam?

- Volume Forms and Probability Density Functions Under Change of Variables
From elementary probability theory, it is well known that a probability density function (pdf) is not invariant under an arbitrary ...

- The Invariance of the Hessian and Its Eigenvalues, Determinant, and Trace
In deep learning, the Hessian and its downstream quantities are observed to be not invariant under reparametrization. This makes the ...

- Convolution of Gaussians and the Probit Integral
Gaussian distributions are very useful in Bayesian inference due to their (many!) convenient properties. In this post we take a ...

- The Last Mile of Creating Publication-Ready Plots
In machine learning papers, plots are often treated as an afterthought: authors simply use the default Matplotlib style, resulting in an ...

- Modern Arts of Laplace Approximations
The Laplace approximation (LA) is a simple yet powerful class of methods for approximating intractable posteriors. Yet, it is largely ...

- Chentsov's Theorem
The Fisher information is often the default choice of the Riemannian metric for manifolds of probability distributions. In this post, ...

- The Curvature of the Manifold of Gaussian Distributions
The Gaussian probability distribution is central in statistics and machine learning. As it turns out, by equipping the set of ...

- Hessian and Curvatures in Machine Learning: A Differential-Geometric View
In machine learning, especially in neural networks, the Hessian matrix is often treated synonymously with curvatures. But, from calculus alone, ...

- Optimization and Gradient Descent on Riemannian Manifolds
One of the most ubiquitous applications in the field of differential geometry is the optimization problem. In this article we ...

- Minkowski's, Dirichlet's, and Two Squares Theorem
Applications of Minkowski's Theorem to geometry problems, Dirichlet's Approximation Theorem, and the Two Squares Theorem.

- Reduced Betti number of sphere: Mayer-Vietoris Theorem
A computation of the reduced homology of the sphere using the Mayer-Vietoris sequence.

- Brouwer's Fixed Point Theorem: A Proof with Reduced Homology
A proof of a special case (the ball) of Brouwer's Fixed Point Theorem using reduced homology.

- Natural Gradient Descent
Intuition and derivation of natural gradient descent.

- Fisher Information Matrix
An introduction to, and intuition for, the Fisher Information Matrix.

- Introduction to Annealed Importance Sampling
An introduction to, and implementation of, Annealed Importance Sampling (AIS).

- Gibbs Sampler for LDA
Implementation of a Gibbs sampler for inference in Latent Dirichlet Allocation (LDA).

- Boundary Seeking GAN
Training GAN by moving the generated samples to the decision boundary.

- Least Squares GAN
2017 is the year GAN lost its logarithm. First it was Wasserstein GAN, and now it's LSGAN's turn.

- CoGAN: Learning joint distribution with GAN
The original GAN and the Conditional GAN learn the marginal and conditional distributions of data, respectively. But how can we extend ...

- Wasserstein GAN implementation in TensorFlow and Pytorch
Wasserstein GAN comes with the promise of stabilizing GAN training and abolishing the mode collapse problem.

- InfoGAN: unsupervised conditional GAN in TensorFlow and Pytorch
Adding Mutual Information regularization to a GAN turns out to give us a very nice effect: learning data representation and its ...

- Maximizing likelihood is equivalent to minimizing KL-Divergence
We will show that doing MLE is equivalent to minimizing the KL-Divergence between the estimator and the true distribution.

- Variational Autoencoder (VAE) in Pytorch
With all of those bells and whistles surrounding Pytorch, let's implement Variational Autoencoder (VAE) using it.

- Generative Adversarial Networks (GAN) in Pytorch
Pytorch is a new Python Deep Learning library, derived from Torch. Contrary to Theano's and TensorFlow's symbolic operations, Pytorch uses ...

- Theano for solving Partial Differential Equation problems
We all know Theano as a forefront library for Deep Learning research. However, it should be noted that Theano is ...

- Linear Regression: A Bayesian Point of View
You know the drill: apply mean squared error, then descend those gradients. But what is the intuition behind that process ...

- MLE vs MAP: the connection between Maximum Likelihood and Maximum A Posteriori Estimation
In this post, we will see what the difference is between Maximum Likelihood Estimation (MLE) and Maximum A Posteriori (MAP) estimation.

- Conditional Generative Adversarial Nets in TensorFlow
Having seen the GAN, VAE, and CVAE models, it is only proper to study the Conditional GAN model next!

- KL Divergence: Forward vs Reverse?
KL Divergence is a measure of how different two probability distributions are. It is a non-symmetric distance function, and each ...

- Conditional Variational Autoencoder: Intuition and Implementation
An extension to Variational Autoencoder (VAE), Conditional Variational Autoencoder (CVAE) enables us to learn a conditional distribution of our data, ...

- Variational Autoencoder: Intuition and Implementation
Variational Autoencoder (VAE) (Kingma et al., 2013) is a new perspective in the autoencoding business. It views Autoencoder as a ...

- Deriving Contractive Autoencoder and Implementing it in Keras
Contractive Autoencoder is a more sophisticated kind of Autoencoder than those in the last post. Here, we will dissect the loss function ...

- Many flavors of Autoencoder
Autoencoder is a family of methods that answers the problem of data reconstruction using neural nets. There are several variations ...

- Level Set Method Part II: Image Segmentation
Level Set Method is an interesting classical (pre-deep-learning) Computer Vision method based on Partial Differential Equations (PDEs) for ...

- Level Set Method Part I: Introduction
Level Set Method is an interesting classical (pre-deep-learning) Computer Vision method based on Partial Differential Equations (PDEs) for ...

- Residual Net
In this post, we will look into the record-breaking convnet model of 2015: the Residual Net (ResNet).

- Generative Adversarial Nets in TensorFlow
Let's try to implement Generative Adversarial Nets (GAN), first introduced by Goodfellow et al., 2014, with TensorFlow. We'll use MNIST ...

- How to Use Specific Image and Description when Sharing Jekyll Post to Facebook
Normally, a random subset of pictures and the site's description will be picked when we share our Jekyll blog post URL ...

- Deriving LSTM Gradient for Backpropagation
Deriving a neural net's gradient is a great exercise for understanding backpropagation and computational graphs better. In this post we will ...

- Convnet: Implementing Maxpool Layer with Numpy
Another important building block in convnet is the pooling layer. Nowadays, the most widely used is the max pool layer. ...

- Convnet: Implementing Convolution Layer with Numpy
Convnet is dominating the world of computer vision right now. What makes it special is, of course, the convolution layer, hence ...

- Implementing BatchNorm in Neural Net
BatchNorm is a relatively new technique for training neural nets. It gives us a lot of relaxation when initializing the ...

- Implementing Dropout in Neural Net
Dropout is one simple way to regularize a neural net model. This is one of the recent advancements in Deep ...

- Beyond SGD: Gradient Descent with Momentum and Adaptive Learning Rate
There are many attempts to improve Gradient Descent: some add momentum, some add adaptive learning rate. Let's see what's out ...

- Implementing Minibatch Gradient Descent for Neural Networks
Let's use Python and Numpy to implement the Minibatch Gradient Descent algorithm for a simple 3-layer neural network.

- Parallelizing Monte Carlo Simulation in Python
Monte Carlo simulation is all about quantity. It can take a long time to complete. Here's how to speed it ...

- Scrapy as a Library in Long Running Process
Scrapy is a great web crawler framework, but it's tricky to make it run as a library in a long-running ...

- Gaussian Anomaly Detection
Gaussian anomaly detection, done both the frequentist and the Bayesian way.

- Slice Sampling
An implementation example of Slice Sampling for a special case: a unimodal distribution with a known inverse PDF.

- Rejection Sampling
Rejection is always painful, but it's for the greater good! You can sample from a complicated distribution by rejecting samples!

- Metropolis-Hastings
An implementation example of the Metropolis-Hastings algorithm in Python.

- Gibbs Sampling
An example Gibbs Sampling implementation in Python for sampling from a bivariate Gaussian.

- Twitter Authentication with Tweepy and Flask
A tutorial on how to do Twitter OAuth authentication in a Flask web application.

- Deploying Wagtail App
In this post, I'll show you how to deploy our blog and how to solve some common problems when deploying ...

- Developing Blog with Wagtail
My experience building this blog using Wagtail CMS, with zero Django knowledge. Let’s code our blog!

- Setting Up Wagtail Development Environment
My experience building a blog using Wagtail CMS, with zero Django knowledge. I’ll walk you through from scratch up ...