
# Convolution of Gaussians and the Probit Integral

Gaussian distributions are very useful in Bayesian inference due to their (many!) convenient properties. In this post we take a look at two of them: the convolution of two Gaussian pdfs and the integral of the probit function w.r.t. a Gaussian measure.

## Convolution and the Predictive Distribution of Gaussian Regression

Let’s start with the *convolution* of two Gaussian pdfs.

**Proposition 1 (Convolution of Gaussians).** *Let $\mathcal{N}(x \mid \mu_1, \sigma_1^2)$ and $\mathcal{N}(x \mid \mu_2, \sigma_2^2)$ be two Gaussian pdfs over $x \in \mathbb{R}$. Their convolution is the Gaussian pdf $\mathcal{N}(x \mid \mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$.*

*Proof.*
By the convolution theorem, the Fourier transform of the convolution of two functions equals the product of the functions’ Fourier transforms.
The Fourier transform of a density function is given by its characteristic function.
For a Gaussian $\mathcal{N}(x \mid \mu, \sigma^2)$, the characteristic function is

$$\varphi(t) = \exp\left(it\mu - \frac{1}{2}\sigma^2 t^2\right),$$

so the product of the characteristic functions of $\mathcal{N}(x \mid \mu_1, \sigma_1^2)$ and $\mathcal{N}(x \mid \mu_2, \sigma_2^2)$ is

$$\varphi_1(t)\,\varphi_2(t) = \exp\left(it(\mu_1 + \mu_2) - \frac{1}{2}\left(\sigma_1^2 + \sigma_2^2\right)t^2\right),$$

which we can immediately identify as the characteristic function of a Gaussian with mean $\mu_1 + \mu_2$ and variance $\sigma_1^2 + \sigma_2^2$. Taking the inverse Fourier transform then yields the claim. $\square$
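We can sanity-check Proposition 1 numerically by discretizing two Gaussian pdfs on a grid and convolving them directly (a quick sketch; the parameters below are arbitrary choices, not from the text):

```python
import numpy as np

# Grid-based check of Proposition 1: the numerical convolution of
# N(x | mu1, var1) and N(x | mu2, var2) should match N(x | mu1+mu2, var1+var2).
def gauss(x, mu, var):
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

mu1, var1 = -1.0, 0.5
mu2, var2 = 2.0, 1.5

dx = 0.01
x = np.arange(-20, 20, dx)

# Riemann-sum approximation of the convolution integral
conv = np.convolve(gauss(x, mu1, var1), gauss(x, mu2, var2), mode="same") * dx
expected = gauss(x, mu1 + mu2, var1 + var2)

print(np.max(np.abs(conv - expected)))  # small discretization error
```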

This result is very useful in Bayesian machine learning, especially to obtain the predictive distribution of a Bayesian regression model.
For instance, when one knows that the distribution over the regressor’s output is a Gaussian $p(f \mid x) = \mathcal{N}(f \mid \mu, \sigma^2)$ and the likelihood $p(y \mid f) = \mathcal{N}(y \mid f, s^2)$ is also Gaussian, the predictive distribution $p(y \mid x)$ has a simple closed form.

**Corollary 2 (Gaussian Regression).** *Let $p(f \mid x) = \mathcal{N}(f \mid \mu, \sigma^2)$ be the distribution over a regressor’s output given an input $x$, and let $p(y \mid f) = \mathcal{N}(y \mid f, s^2)$ be a Gaussian likelihood. Then the predictive distribution is*

$$p(y \mid x) = \int \mathcal{N}(y \mid f, s^2)\,\mathcal{N}(f \mid \mu, \sigma^2)\,df = \mathcal{N}(y \mid \mu, \sigma^2 + s^2).$$

*Proof.*
First, notice that the Gaussian pdf is symmetric around its mean:

$$\mathcal{N}(y \mid f, s^2) = \mathcal{N}(y - f \mid 0, s^2)$$

for any $y, f \in \mathbb{R}$. Hence, the integral in the statement is exactly the convolution of the pdfs $\mathcal{N}(\cdot \mid 0, s^2)$ and $\mathcal{N}(\cdot \mid \mu, \sigma^2)$, evaluated at $y$. Thus, by Proposition 1, we have

$$p(y \mid x) = \mathcal{N}(y \mid 0 + \mu, s^2 + \sigma^2) = \mathcal{N}(y \mid \mu, \sigma^2 + s^2). \qquad \square$$
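Corollary 2 is easy to verify by simulation: sample $f$, then sample $y$ given $f$, and check the marginal moments of $y$ (an illustrative sketch; `mu`, `sigma`, and `s` are arbitrary values):

```python
import numpy as np

# Monte-Carlo check of Corollary 2: if f ~ N(mu, sigma^2) and y | f ~ N(f, s^2),
# then marginally y ~ N(mu, sigma^2 + s^2).
rng = np.random.default_rng(0)
mu, sigma, s = 1.5, 0.8, 0.5

f = rng.normal(mu, sigma, size=1_000_000)  # samples of the regressor's output
y = rng.normal(f, s)                       # noisy observations y | f

print(y.mean())  # should be close to mu = 1.5
print(y.var())   # should be close to sigma^2 + s^2 = 0.89
```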

## The Probit Integral and the Probit Approximation

The *probit function* $\Phi: \mathbb{R} \to (0, 1)$ is the cdf of the standard normal distribution $\mathcal{N}(0, 1)$. It is connected to the *error function*

$$\mathrm{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2}\,dt$$

by

$$\Phi(z) = \frac{1}{2}\left(1 + \mathrm{erf}\left(\frac{z}{\sqrt{2}}\right)\right).$$

**Proposition 3 (The Probit Integral).** *If $\Phi$ is the probit function and $\mathcal{N}(x \mid \mu, \sigma^2)$ is a Gaussian pdf over $x \in \mathbb{R}$, then*

$$\int_{-\infty}^{\infty} \Phi(x)\,\mathcal{N}(x \mid \mu, \sigma^2)\,dx = \Phi\left(\frac{\mu}{\sqrt{1 + \sigma^2}}\right).$$

*Proof.*
A standard property of the error function [2] says that

$$\int_{-\infty}^{\infty} \mathrm{erf}(ax + b)\,\mathcal{N}(x \mid \mu, \sigma^2)\,dx = \mathrm{erf}\left(\frac{a\mu + b}{\sqrt{1 + 2a^2\sigma^2}}\right).$$

So, with $a = 1/\sqrt{2}$ and $b = 0$,

$$\int_{-\infty}^{\infty} \Phi(x)\,\mathcal{N}(x \mid \mu, \sigma^2)\,dx = \frac{1}{2}\left(1 + \int_{-\infty}^{\infty} \mathrm{erf}\left(\frac{x}{\sqrt{2}}\right)\mathcal{N}(x \mid \mu, \sigma^2)\,dx\right) = \frac{1}{2}\left(1 + \mathrm{erf}\left(\frac{\mu/\sqrt{2}}{\sqrt{1 + \sigma^2}}\right)\right) = \Phi\left(\frac{\mu}{\sqrt{1 + \sigma^2}}\right). \qquad \square$$
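Proposition 3 can be checked against numerical quadrature (a sketch; `mu` and `sigma` are arbitrary):

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

# Numerical check of Proposition 3:
#   ∫ Φ(x) N(x | mu, sigma^2) dx  =  Φ(mu / sqrt(1 + sigma^2))
mu, sigma = 0.7, 1.3

lhs, _ = quad(lambda x: norm.cdf(x) * norm.pdf(x, loc=mu, scale=sigma),
              -np.inf, np.inf)
rhs = norm.cdf(mu / np.sqrt(1 + sigma**2))

print(lhs, rhs)  # the two values agree to high precision
```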

This integral is very useful for Bayesian inference since it enables us to approximate the following integral, which is ubiquitous in Bayesian binary classification:

$$\int_{-\infty}^{\infty} \sigma(x)\,\mathcal{N}(x \mid \mu, \sigma^2)\,dx,$$

where $\sigma(x) = 1/(1 + e^{-x})$ is the *logistic function*.

The key idea is to notice that the probit and the logistic function are both *sigmoid* functions.
That is, their graphs have a similar “S-shape”.
Moreover, their images are both the interval $(0, 1)$.

So, the strategy to approximate the integral above is as follows: (i) horizontally “contract” the probit function and then (ii) use Proposition 3 to get an analytic approximation to the integral.

For the first step, this can be done by a simple change of coordinates: stretch the domain of the probit function with a constant $\lambda > 0$, i.e., consider $\Phi(\lambda x)$ instead of $\Phi(x)$. The following corollary tells us how to integrate this scaled probit function against a Gaussian.

**Corollary 4.** *If $\mathcal{N}(x \mid \mu, \sigma^2)$ is a Gaussian pdf over $x \in \mathbb{R}$ and $\lambda > 0$, then*

$$\int_{-\infty}^{\infty} \Phi(\lambda x)\,\mathcal{N}(x \mid \mu, \sigma^2)\,dx = \Phi\left(\frac{\mu}{\sqrt{\lambda^{-2} + \sigma^2}}\right).$$

*Proof.*
Substituting $u = \lambda x$, so that $\mathcal{N}(x \mid \mu, \sigma^2)\,dx = \mathcal{N}(u \mid \lambda\mu, \lambda^2\sigma^2)\,du$, and applying Proposition 3, we have

$$\int_{-\infty}^{\infty} \Phi(\lambda x)\,\mathcal{N}(x \mid \mu, \sigma^2)\,dx = \int_{-\infty}^{\infty} \Phi(u)\,\mathcal{N}(u \mid \lambda\mu, \lambda^2\sigma^2)\,du = \Phi\left(\frac{\lambda\mu}{\sqrt{1 + \lambda^2\sigma^2}}\right) = \Phi\left(\frac{\mu}{\sqrt{\lambda^{-2} + \sigma^2}}\right). \qquad \square$$

Now we are ready to obtain the final approximation, often called the **probit approximation**.

**Proposition 5 (Probit Approximation).** *If $\sigma(x) = 1/(1 + e^{-x})$ is the logistic function and $\mathcal{N}(x \mid \mu, \sigma^2)$ is a Gaussian pdf over $x \in \mathbb{R}$, then*

$$\int_{-\infty}^{\infty} \sigma(x)\,\mathcal{N}(x \mid \mu, \sigma^2)\,dx \approx \sigma\left(\frac{\mu}{\sqrt{1 + \pi\sigma^2/8}}\right).$$

*Proof.*
Let $\lambda = \sqrt{\pi/8}$, so that $\sigma(x) \approx \Phi(\lambda x)$. This choice of $\lambda$ makes the derivatives of the two functions agree at zero: $\sigma'(0) = 1/4$, while $\frac{d}{dx}\Phi(\lambda x)\big|_{x=0} = \lambda/\sqrt{2\pi}$, and the two are equal exactly when $\lambda = \sqrt{\pi/8}$.

Substituting this approximation into the integral and applying Corollary 4, we obtain

$$\int_{-\infty}^{\infty} \sigma(x)\,\mathcal{N}(x \mid \mu, \sigma^2)\,dx \approx \int_{-\infty}^{\infty} \Phi(\lambda x)\,\mathcal{N}(x \mid \mu, \sigma^2)\,dx = \Phi\left(\frac{\mu}{\sqrt{\lambda^{-2} + \sigma^2}}\right) \approx \sigma\left(\frac{\mu}{\lambda\sqrt{\lambda^{-2} + \sigma^2}}\right) = \sigma\left(\frac{\mu}{\sqrt{1 + \pi\sigma^2/8}}\right). \qquad \square$$
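How good is this approximation in practice? A quick numerical comparison (a sketch; `mu` and `sigma` are arbitrary illustrative values):

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

# Compare the exact Gaussian-logistic integral with the probit approximation
#   sigma(mu / sqrt(1 + pi*sigma^2/8))  from Proposition 5.
def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

mu, sigma = 1.0, 2.0
exact, _ = quad(lambda x: logistic(x) * norm.pdf(x, loc=mu, scale=sigma),
                -np.inf, np.inf)
approx = logistic(mu / np.sqrt(1 + np.pi * sigma**2 / 8))

print(exact, approx)  # the two values are close
```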

The probit approximation can also be used to obtain an approximation to the following integral, ubiquitous in multi-class classification:

$$\int \mathrm{softmax}(x)\,\mathcal{N}(x \mid \mu, \Sigma)\,dx,$$

where the Gaussian is defined on $\mathbb{R}^k$ and $\mathrm{softmax}_i(x) = \exp(x_i) / \sum_{j=1}^k \exp(x_j)$ for $i = 1, \dots, k$.

**Proposition 6 (Multiclass Probit Approximation; Gibbs, 1998).** *If $\mathcal{N}(x \mid \mu, \Sigma)$ is a Gaussian pdf over $x \in \mathbb{R}^k$, then*

$$\int \mathrm{softmax}(x)\,\mathcal{N}(x \mid \mu, \Sigma)\,dx \approx \mathrm{softmax}\left(\frac{\mu}{\sqrt{1 + \pi\,\mathrm{diag}(\Sigma)/8}}\right),$$

*where the division in the r.h.s. is component-wise.*

*Proof.*
The proof is based on [3].
Notice that we can write the $i$-th component of the softmax function as

$$\mathrm{softmax}_i(x) = \frac{\exp(x_i)}{\sum_{j=1}^k \exp(x_j)} = \frac{1}{1 + \sum_{j \neq i} \exp(-(x_i - x_j))}.$$

Then, we use the following approximations (which admittedly might be quite loose):

- the mean-field approximation $\mathcal{N}(x \mid \mu, \Sigma) \approx \prod_{i=1}^k \mathcal{N}(x_i \mid \mu_i, \Sigma_{ii})$, and thus we have $x_i - x_j \sim \mathcal{N}(\mu_i - \mu_j, \Sigma_{ii} + \Sigma_{jj})$, and
- the probit approximation (Proposition 5), applied to each pairwise term, with a further approximation

$$\frac{\mu_i - \mu_j}{\sqrt{1 + \pi(\Sigma_{ii} + \Sigma_{jj})/8}} \approx \frac{\mu_i}{\sqrt{1 + \pi\Sigma_{ii}/8}} - \frac{\mu_j}{\sqrt{1 + \pi\Sigma_{jj}/8}}.$$

With these, we obtain

$$\int \mathrm{softmax}_i(x)\,\mathcal{N}(x \mid \mu, \Sigma)\,dx \approx \frac{1}{1 + \sum_{j \neq i} \exp\left(-\left(\frac{\mu_i}{\sqrt{1 + \pi\Sigma_{ii}/8}} - \frac{\mu_j}{\sqrt{1 + \pi\Sigma_{jj}/8}}\right)\right)}.$$

We identify the last equation above as the $i$-th component of

$$\mathrm{softmax}\left(\frac{\mu}{\sqrt{1 + \pi\,\mathrm{diag}(\Sigma)/8}}\right). \qquad \square$$
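Since the approximations above are admittedly loose, it is worth checking Proposition 6 against a Monte-Carlo estimate (an illustrative sketch; `mu` and `Sigma` below are arbitrary choices):

```python
import numpy as np

# Monte-Carlo sanity check of the multiclass probit approximation:
# compare E[softmax(x)] under x ~ N(mu, Sigma) with
# softmax(mu / sqrt(1 + pi * diag(Sigma) / 8)).
def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
mu = np.array([1.0, 0.0, -0.5])
Sigma = np.diag([0.5, 1.0, 2.0])

x = rng.multivariate_normal(mu, Sigma, size=500_000)
mc = softmax(x).mean(axis=0)  # Monte-Carlo estimate of the integral
approx = softmax(mu / np.sqrt(1 + np.pi * np.diag(Sigma) / 8))

print(mc)
print(approx)  # roughly matches the Monte-Carlo estimate
```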

## References

- Ng, Edward W., and Murray Geller. “A table of integrals of the error functions.” *Journal of Research of the National Bureau of Standards B* 73, no. 1 (1969): 1–20.
- Gibbs, Mark N. *Bayesian Gaussian processes for regression and classification*. Dissertation, University of Cambridge, 1998.
- Lu, Zhiyun, Eugene Ie, and Fei Sha. “Mean-Field Approximation to Gaussian-Softmax Integral with Application to Uncertainty Estimation.” *arXiv preprint arXiv:2006.07584* (2020).