
InfoGAN: unsupervised conditional GAN in TensorFlow and Pytorch

Adding a mutual information regularizer to a GAN turns out to have a very nice effect: it learns the data representation and its properties in an unsupervised manner.
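Concretely, InfoGAN augments the standard minimax game with a variational lower bound $L_I$ on the mutual information between the latent code $c$ and the generated sample (notation follows the InfoGAN paper, with $Q$ an auxiliary network):

$$
\min_{G, Q} \max_{D} \; V(D, G) - \lambda L_I(G, Q),
\qquad
L_I(G, Q) = \mathbb{E}_{c \sim p(c),\, x \sim G(z, c)} \big[ \log Q(c \mid x) \big] + H(c),
$$

where $V(D, G)$ is the usual GAN value function and $Q(c \mid x)$ approximates the intractable posterior $p(c \mid x)$.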

Maximizing likelihood is equivalent to minimizing KL-Divergence

We will show that doing MLE is equivalent to minimizing the KL-Divergence between the true data distribution and our model distribution.
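In a nutshell, writing $p_{\text{data}}$ for the true distribution and $p_\theta$ for our model, the argument goes:

$$
\mathrm{KL}(p_{\text{data}} \,\|\, p_\theta)
= \underbrace{\mathbb{E}_{x \sim p_{\text{data}}} \big[ \log p_{\text{data}}(x) \big]}_{\text{constant w.r.t. } \theta}
- \mathbb{E}_{x \sim p_{\text{data}}} \big[ \log p_\theta(x) \big],
$$

so minimizing the KL over $\theta$ is the same as maximizing the expected log-likelihood, which MLE approximates with the empirical average over the dataset.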

Variational Autoencoder (VAE) in Pytorch

With all of those bells and whistles surrounding Pytorch, let's implement a Variational Autoencoder (VAE) with it.
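As a minimal sketch of what such an implementation might look like (layer sizes and names here are illustrative, assuming MNIST-sized inputs, and are not the post's code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, x_dim=784, h_dim=128, z_dim=20):
        super().__init__()
        self.enc = nn.Linear(x_dim, h_dim)
        self.mu = nn.Linear(h_dim, z_dim)       # mean of q(z|x)
        self.logvar = nn.Linear(h_dim, z_dim)   # log-variance of q(z|x)
        self.dec1 = nn.Linear(z_dim, h_dim)
        self.dec2 = nn.Linear(h_dim, x_dim)

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        x_recon = torch.sigmoid(self.dec2(F.relu(self.dec1(z))))
        return x_recon, mu, logvar

def vae_loss(x_recon, x, mu, logvar):
    # Negative ELBO = reconstruction loss + KL(q(z|x) || N(0, I))
    recon = F.binary_cross_entropy(x_recon, x, reduction='sum')
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl
```

Training then just feeds batches through `VAE.forward`, computes `vae_loss`, and calls `backward()` on it.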

Generative Adversarial Networks (GAN) in Pytorch

Pytorch is a new Python Deep Learning library, derived from Torch. In contrast to Theano's and TensorFlow's symbolic operations, Pytorch uses an imperative programming style, which makes implementations feel more "Numpy-like".
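As a rough sketch of that imperative style (the networks and hyperparameters below are placeholders, not the post's actual code):

```python
import torch
import torch.nn as nn

# Toy generator and discriminator for 28x28 images; real code would use proper models.
G = nn.Sequential(nn.Linear(100, 128), nn.ReLU(), nn.Linear(128, 784), nn.Sigmoid())
D = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 1), nn.Sigmoid())

g_opt = torch.optim.Adam(G.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

def train_step(x_real):
    batch = x_real.size(0)
    ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)

    # Discriminator step: push real samples toward 1, fakes toward 0
    z = torch.randn(batch, 100)
    d_loss = bce(D(x_real), ones) + bce(D(G(z).detach()), zeros)
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator step: try to fool the discriminator
    z = torch.randn(batch, 100)
    g_loss = bce(D(G(z)), ones)
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()
```

Everything runs eagerly, line by line, which is exactly what makes debugging feel like working with Numpy.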

Theano for solving Partial Differential Equation problems

We all know Theano as a leading library for Deep Learning research. However, Theano is really a general-purpose numerical computing library, like Numpy. Hence, in this post, we will look at implementing a PDE simulation in Theano.
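For instance, an explicit finite-difference step for the 2D heat equation could be compiled roughly like this (the particular PDE, grid, and constants are illustrative assumptions, not necessarily the ones used in the post):

```python
import numpy as np
import theano
import theano.tensor as T

# 2D heat equation u_t = alpha * (u_xx + u_yy), explicit finite differences
u = T.dmatrix('u')
alpha, dt, dx = 0.25, 0.1, 1.0

# Discrete Laplacian on the interior of the grid via shifted slices
lap = (u[2:, 1:-1] + u[:-2, 1:-1] + u[1:-1, 2:] + u[1:-1, :-2]
       - 4.0 * u[1:-1, 1:-1]) / dx**2
u_next = T.set_subtensor(u[1:-1, 1:-1], u[1:-1, 1:-1] + alpha * dt * lap)

step = theano.function([u], u_next)

# Usage: iterate the compiled step on a numpy grid with some boundary condition
grid = np.zeros((50, 50))
grid[0, :] = 100.0
for _ in range(1000):
    grid = step(grid)
```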

Linear Regression: A Bayesian Point of View

You know the drill: apply the mean squared error, then descend those gradients. But what is the intuition behind that process from a Bayesian point of view?
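The short version: assume Gaussian noise, $y_i = \mathbf{w}^\top \mathbf{x}_i + \epsilon_i$ with $\epsilon_i \sim \mathcal{N}(0, \sigma^2)$. Then

$$
-\log p(\mathbf{y} \mid \mathbf{X}, \mathbf{w})
= \frac{1}{2\sigma^2} \sum_{i=1}^{N} \big( y_i - \mathbf{w}^\top \mathbf{x}_i \big)^2 + \text{const},
$$

so maximizing the likelihood is exactly minimizing the squared error, and adding a Gaussian prior on $\mathbf{w}$ turns the MAP estimate into ridge regression.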

MLE vs MAP: the connection between Maximum Likelihood and Maximum A Posteriori Estimation

In this post, we will see the difference between Maximum Likelihood Estimation (MLE) and Maximum A Posteriori (MAP) estimation.
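The one-line summary:

$$
\hat{\theta}_{\text{MLE}} = \operatorname*{arg\,max}_{\theta} \, \log p(\mathcal{D} \mid \theta),
\qquad
\hat{\theta}_{\text{MAP}} = \operatorname*{arg\,max}_{\theta} \, \big[ \log p(\mathcal{D} \mid \theta) + \log p(\theta) \big],
$$

i.e. MAP is MLE plus a log-prior term, and with a flat prior the two estimates coincide.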

Conditional Generative Adversarial Nets in TensorFlow

Having seen the GAN, VAE, and CVAE models, it is only proper to study the Conditional GAN model next!

KL Divergence: Forward vs Reverse?

KL Divergence is a measure of how different two probability distributions are. It is a non-symmetric divergence, and each direction (forward and reverse) has its own interesting properties, especially when we use it in optimization settings, e.g. the Variational Bayes method.
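For reference, with $P$ the true distribution and $Q$ the approximation, the two directions are:

$$
\underbrace{\mathrm{KL}(P \,\|\, Q) = \mathbb{E}_{x \sim P} \left[ \log \frac{P(x)}{Q(x)} \right]}_{\text{forward}},
\qquad
\underbrace{\mathrm{KL}(Q \,\|\, P) = \mathbb{E}_{x \sim Q} \left[ \log \frac{Q(x)}{P(x)} \right]}_{\text{reverse}}.
$$

The forward direction forces $Q$ to put mass wherever $P$ does (mass-covering), while the reverse direction, the one used in Variational Bayes, lets $Q$ lock onto a single mode of $P$ (mode-seeking).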

Conditional Variational Autoencoder: Intuition and Implementation

An extension of the Variational Autoencoder (VAE), the Conditional Variational Autoencoder (CVAE) enables us to learn a conditional distribution of our data, which makes the VAE more expressive and applicable to many interesting tasks.
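In symbols, the CVAE simply conditions every piece of the ELBO on the extra variable $c$ (e.g. a class label):

$$
\log p_\theta(x \mid c) \;\geq\; \mathbb{E}_{q_\phi(z \mid x, c)} \big[ \log p_\theta(x \mid z, c) \big] - \mathrm{KL}\big( q_\phi(z \mid x, c) \,\|\, p(z \mid c) \big),
$$

which is the usual VAE objective with the encoder, decoder, and prior all given access to $c$.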