Let $p(x \mid \theta)$ be a probability density over a random variable $x$, parametrized by $\theta \in \Theta \subseteq \mathbb{R}^d$. The **Fisher information** is defined by

$$
F(\theta) := \mathop{\mathbb{E}}_{p(x \mid \theta)} \left[ \nabla_\theta \log p(x \mid \theta) \, \nabla_\theta \log p(x \mid \theta)^\top \right],
$$

where $\nabla_\theta \log p(x \mid \theta)$ is the so-called *score function* of $p$.

## The Fisher Information under Sufficient Statistics

Let $T = T(x)$ be a **sufficient statistic** for the parameter $\theta$, i.e. the conditional distribution of $x$ given $T(x)$ does not depend on $\theta$. By the Fisher–Neyman factorization theorem, this is equivalent to the existence of functions $g$ and $h$ such that

$$
p(x \mid \theta) = g(T(x) \mid \theta) \, h(x) .
$$

The following proposition shows the behavior of the Fisher information under sufficient statistics.

**Proposition 1.** *The Fisher information is invariant under sufficient statistics. That is, if $T$ is a sufficient statistic for $\theta$, then the Fisher information $F_T(\theta)$ computed from the density of $T(x)$ equals the Fisher information $F(\theta)$ computed from $p(x \mid \theta)$.*

*Proof.* Let $p(x \mid \theta) = g(T(x) \mid \theta) \, h(x)$ be the factorization induced by the sufficiency of $T$. Taking the logarithm and differentiating, we have

$$
\nabla_\theta \log p(x \mid \theta) = \nabla_\theta \log g(T(x) \mid \theta),
$$

since $\log h(x)$ does not depend on $\theta$. Furthermore, the density of $T$ itself factorizes as $p_T(t \mid \theta) = g(t \mid \theta) \, \tilde{h}(t)$ with $\tilde{h}$ independent of $\theta$, so the score of $T(x)$ also equals $\nabla_\theta \log g(T(x) \mid \theta)$. So, the Fisher information of $p(x \mid \theta)$ is

$$
F(\theta) = \mathop{\mathbb{E}} \left[ \nabla_\theta \log g(T(x) \mid \theta) \, \nabla_\theta \log g(T(x) \mid \theta)^\top \right] = F_T(\theta) .
$$

We conclude that replacing the data $x$ with a sufficient statistic $T(x)$ leaves the Fisher information unchanged.
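To make Proposition 1 concrete, here is a quick numerical sanity check (my own sketch, not part of the proof): for $n$ i.i.d. samples from $N(\theta, 1)$, the sample mean is a sufficient statistic for $\theta$, and both the full sample and the mean should carry Fisher information $n$. Note, in fact, that the two scores coincide pointwise, just as in the proof above.

```python
import numpy as np

rng = np.random.default_rng(0)
theta, n, trials = 1.3, 5, 200_000

# Draw `trials` datasets of n i.i.d. N(theta, 1) samples.
x = rng.normal(theta, 1.0, size=(trials, n))

# Score of the full sample: d/dtheta log p(x | theta) = sum_i (x_i - theta).
score_x = (x - theta).sum(axis=1)

# The sample mean T = x_bar is sufficient; T ~ N(theta, 1/n), so its score
# is d/dtheta log p(T | theta) = n * (x_bar - theta).
score_T = n * (x.mean(axis=1) - theta)

# Fisher information = variance of the score (the score has mean zero).
F_x = score_x.var()
F_T = score_T.var()

print(F_x, F_T)  # both close to n = 5
```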

## The Fisher Information as a Riemannian Metric

Let

$$
\mathcal{P} := \{\, p(x \mid \theta) : \theta \in \Theta \subseteq \mathbb{R}^d \,\}
$$

be the set of the parametric densities, identified with their parameters $\theta$.

Let us assume that the map $\theta \mapsto p(x \mid \theta)$ is smooth and injective, so that $\mathcal{P}$ is a $d$-dimensional smooth manifold with global coordinates $\theta$. It is then natural to ask whether the Fisher information, a smoothly varying positive-definite matrix, defines a Riemannian metric on $\mathcal{P}$. For that, its components must transform covariantly under a change of coordinates.

**Proposition 2.** _The component functions $F_{ij}(\theta)$ of the Fisher information transform as a covariant 2-tensor under a change of parametrization. Hence, the Fisher information defines a Riemannian metric on $\mathcal{P}$._

*Proof.* Let $\psi = \psi(\theta)$ be another parametrization of $\mathcal{P}$, and write $\ell := \log p(x \mid \psi(\theta))$. The components of the Fisher information in the coordinates $\theta$ are

$$
F_{ij}(\theta) = \mathop{\mathbb{E}} \left[ \frac{\partial \ell}{\partial \theta_i} \frac{\partial \ell}{\partial \theta_j} \right] = \mathop{\mathbb{E}} \left[ \sum_{k, l} \frac{\partial \psi_k}{\partial \theta_i} \frac{\partial \ell}{\partial \psi_k} \frac{\partial \ell}{\partial \psi_l} \frac{\partial \psi_l}{\partial \theta_j} \right] = \sum_{k, l} \frac{\partial \psi_k}{\partial \theta_i} \widetilde{F}_{kl}(\psi) \frac{\partial \psi_l}{\partial \theta_j},
$$

where the second equality follows from the standard chain rule. We conclude that the components of the Fisher information transform exactly as the components of a Riemannian metric under a change of coordinates.
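As an illustrative check of this transformation rule (a sketch with a Bernoulli model of my choosing, not from the text): the Fisher information of $\mathrm{Bernoulli}(p)$ is $1/(p(1-p))$ in the mean parameter $p$ and $p(1-p)$ in the logit parameter $\psi = \log\frac{p}{1-p}$, and the two are related by the Jacobian $\partial\psi/\partial p = 1/(p(1-p))$ exactly as Proposition 2 prescribes.

```python
import numpy as np

p = np.linspace(0.05, 0.95, 19)

# Fisher information of Bernoulli(p) in the mean parametrization.
F_p = 1.0 / (p * (1.0 - p))

# In the logit parametrization psi = log(p / (1 - p)), the Fisher
# information is p * (1 - p).
F_psi = p * (1.0 - p)

# Covariant transformation rule: F_p = (dpsi/dp) * F_psi * (dpsi/dp).
dpsi_dp = 1.0 / (p * (1.0 - p))
F_p_from_psi = dpsi_dp * F_psi * dpsi_dp

print(np.max(np.abs(F_p - F_p_from_psi)))  # ~0
```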

## Chentsov’s Theorem

The previous two results are useful since the Fisher information metric is invariant under sufficient statistics. In this sense, the Fisher metric is a statistically natural Riemannian metric on the manifold $\mathcal{P}$ of probability distributions.

Here, we shall see a stronger statement, due to Chentsov in 1972, about the Fisher metric: It is the *unique* statistically-invariant metric for probability distributions, up to a multiplicative constant.

Originally, Chentsov’s theorem is stated on the space of Categorical probability distributions over the sample space $\{1, \dots, n\}$, and the invariance is w.r.t. so-called *Markov embeddings*.

Let $m \le n$ and let $A = \{A_1, \dots, A_m\}$ be a partition of $\{1, \dots, n\}$ into non-empty subsets. For each $i = 1, \dots, m$, let $q^{(i)}$ be a probability vector indexed by $A_i$, i.e. $q^{(i)}_j > 0$ for $j \in A_i$ and $\sum_{j \in A_i} q^{(i)}_j = 1$. Define a map $f: \Delta_m \to \Delta_n$ between probability simplices by

$$
f(p)_j := p_i \, q^{(i)}_j \qquad \text{for } j \in A_i .
$$

That is, the $i$-th component $p_i$ of $p$ is split among the indices in $A_i$, in proportions given by $q^{(i)}$.

We call this map a **Markov embedding**. The name suggests that $f$ embeds the smaller simplex $\Delta_m$ into the larger one $\Delta_n$; indeed, $f$ is induced by the row-stochastic (Markov) matrix $Q$ with $Q_{ij} = q^{(i)}_j$ for $j \in A_i$ and $Q_{ij} = 0$ otherwise.
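A Markov embedding is straightforward to implement. The following sketch (function and variable names are mine) splits each $p_i$ across its block $A_i$ and checks that the image is again a probability vector:

```python
import numpy as np

def markov_embedding(p, blocks, q):
    """Embed p in Delta_m into Delta_n.

    blocks: index lists A_1, ..., A_m partitioning range(n).
    q: probability vectors, q[i] summing to 1 over A_i.
    """
    n = sum(len(A) for A in blocks)
    out = np.zeros(n)
    for p_i, A, q_i in zip(p, blocks, q):
        out[A] = p_i * np.asarray(q_i)  # split p_i in proportions q_i
    return out

p = np.array([0.2, 0.5, 0.3])             # a point in Delta_3
blocks = [[0, 1], [2], [3, 4]]            # partition of {1, ..., 5} (0-indexed)
q = [[0.25, 0.75], [1.0], [0.5, 0.5]]     # conditional proportions

fp = markov_embedding(p, blocks, q)
print(fp)  # the image point in Delta_5; its entries still sum to 1
```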

The result of Campbell (1986) characterizes the form of the Riemannian metrics on $\mathbb{R}^n_+$ that are invariant under Markov embeddings.

**Lemma 3 (Campbell, 1986).** _Let $\{g^{(n)}\}_{n \ge 2}$ be a sequence of Riemannian metrics on the positive orthants $\mathbb{R}^n_+$ that is invariant under Markov embeddings. Then its components take the form_

$$
g_{ij}(z) = A(|z|) + \delta_{ij} \, \frac{|z|}{z_i} \, C(|z|),
$$

_where $|z| := \sum_{k=1}^n z_k$, and $A$, $C$ are smooth functions with $C > 0$ and $A + C > 0$._

*Proof.* See Campbell (1986) and Amari (2016, Sec. 3.5).

Lemma 3 is a general statement about the invariant metric in the positive orthant $\mathbb{R}^n_+$. To make statements about probability distributions, we restrict it to the *probability simplex* $\Delta_n := \{\, p \in \mathbb{R}^n_+ : \sum_{i=1}^n p_i = 1 \,\}$.

The fact that the Fisher information is the unique invariant metric under sufficient statistics follows from the fact that when the sample space is finite, a sufficient statistic amounts to a partition of $\{1, \dots, n\}$, and the induced map between simplices is precisely a Markov embedding. Invariance under Markov embeddings is thus the finite-sample-space analogue of invariance under sufficient statistics.

Let us, therefore, connect the result in Lemma 3 with the Fisher information on the simplex $\Delta_n$.

**Lemma 4.** _The Fisher information of a Categorical distribution $p \in \Delta_n$, expressed in the coordinates $(p_1, \dots, p_{n-1})$ with $p_n = 1 - \sum_{k=1}^{n-1} p_k$, has components $\delta_{ij}/p_i + 1/p_n$._

*That is,*

$$
F_{ij}(p) = \frac{\delta_{ij}}{p_i} + \frac{1}{p_n} \qquad \text{for } i, j = 1, \dots, n-1 .
$$

*Proof.* By definition,

$$
F_{ij}(p) = \mathop{\mathbb{E}}_{x \sim p} \left[ \frac{\partial \log p(x)}{\partial p_i} \frac{\partial \log p(x)}{\partial p_j} \right],
$$

where we assume that $p_n = 1 - \sum_{k=1}^{n-1} p_k$, so that $(p_1, \dots, p_{n-1})$ are free coordinates. Writing $p(x) = \prod_{k=1}^n p_k^{\mathbb{1}[x = k]}$, we have

$$
\frac{\partial \log p(x)}{\partial p_i} = \frac{\mathbb{1}[x = i]}{p_i} - \frac{\mathbb{1}[x = n]}{p_n}
$$

for each $i = 1, \dots, n-1$. Hence, for $i \ne j$,

$$
F_{ij}(p) = \mathop{\mathbb{E}} \left[ \left( \frac{\mathbb{1}[x = i]}{p_i} - \frac{\mathbb{1}[x = n]}{p_n} \right) \left( \frac{\mathbb{1}[x = j]}{p_j} - \frac{\mathbb{1}[x = n]}{p_n} \right) \right] = \frac{p_n}{p_n^2} = \frac{1}{p_n},
$$

since the products $\mathbb{1}[x = i] \, \mathbb{1}[x = j]$ and $\mathbb{1}[x = i] \, \mathbb{1}[x = n]$ are identically zero. Using similar steps, we can show that $F_{ii}(p) = 1/p_i + 1/p_n$, which proves the claim.
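Since the expectation in Lemma 4 is a finite sum, the formula can be verified exactly. The sketch below (variable names are mine) builds $F$ directly from the definition and compares it with $\delta_{ij}/p_i + 1/p_n$:

```python
import numpy as np

p = np.array([0.1, 0.2, 0.3, 0.4])   # a point in Delta_4; p_n = 0.4
n = len(p)

# Score in coordinates (p_1, ..., p_{n-1}), 0-indexed:
# d log p(x) / d p_i = 1[x == i]/p_i - 1[x == n]/p_n.
def score(x):
    s = np.zeros(n - 1)
    if x < n - 1:
        s[x] = 1.0 / p[x]
    else:
        s -= 1.0 / p[-1]   # the last category contributes -1/p_n everywhere
    return s

# F_ij = E[score_i * score_j], an exact finite sum over the n outcomes.
F = sum(p[x] * np.outer(score(x), score(x)) for x in range(n))

# Lemma 4: F_ij = delta_ij / p_i + 1 / p_n.
F_lemma = np.diag(1.0 / p[:-1]) + 1.0 / p[-1]

print(np.allclose(F, F_lemma))  # True
```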

Now we are ready to state the main result.

**Theorem 5 (Chentsov, 1972).** *The Fisher information is the unique Riemannian metric on the probability simplex $\Delta_n$ that is invariant under Markov embeddings, up to a multiplicative constant.*

*Proof.* By Lemma 3, the invariant metric under Markov embeddings in $\mathbb{R}^n_+$ has components

$$
g_{ij}(z) = A(|z|) + \delta_{ij} \, \frac{|z|}{z_i} \, C(|z|)
$$

for any $z \in \mathbb{R}^n_+$.

Let us therefore restrict $g$ to the simplex $\Delta_n$, where $|z| = 1$. There, $A := A(1)$ and $C := C(1)$ are constants, and thus $g_{ij}(p) = A + \delta_{ij} \, C / p_i$.

Moreover, if $u$ and $v$ are tangent vectors to $\Delta_n$ at $p$, their components satisfy $\sum_{i=1}^n u_i = \sum_{i=1}^n v_i = 0$, and so

$$
g_p(u, v) = A \sum_{i=1}^n u_i \sum_{j=1}^n v_j + C \sum_{i=1}^n \frac{u_i v_i}{p_i} = C \sum_{i=1}^n \frac{u_i v_i}{p_i} .
$$

Therefore the term involving $A$ vanishes on $\Delta_n$, and the invariant metric is determined by the single constant $C > 0$.

Recalling that, in the coordinates $(p_1, \dots, p_{n-1})$ with $u_n = -\sum_{i=1}^{n-1} u_i$ and likewise for $v$,

$$
\sum_{i=1}^n \frac{u_i v_i}{p_i} = \sum_{i, j = 1}^{n-1} \left( \frac{\delta_{ij}}{p_i} + \frac{1}{p_n} \right) u_i v_j = \sum_{i, j = 1}^{n-1} F_{ij}(p) \, u_i v_j
$$

by Lemma 4, we conclude that $g = C F$: the invariant metric is the Fisher information, up to the multiplicative constant $C$.
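As an end-to-end numerical check of the invariance asserted by Theorem 5 (a sketch under my own naming, reusing a concrete Markov embedding): pulling the simplex metric $g_p(u, v) = \sum_i u_i v_i / p_i$ back through a Markov embedding should give the same value as evaluating it directly on $\Delta_m$.

```python
import numpy as np

rng = np.random.default_rng(1)

# A Markov embedding f: Delta_3 -> Delta_5 given by a partition and proportions.
blocks = [[0, 1], [2], [3, 4]]
q = [np.array([0.25, 0.75]), np.array([1.0]), np.array([0.5, 0.5])]

def f(p):
    out = np.zeros(5)
    for p_i, A, q_i in zip(p, blocks, q):
        out[A] = p_i * q_i
    return out

def df(u):
    # f is linear in p, so its differential acts on tangent vectors as f does.
    return f(u)

def fisher_metric(p, u, v):
    # Fisher metric on the simplex: g_p(u, v) = sum_i u_i v_i / p_i.
    return np.sum(u * v / p)

p = np.array([0.2, 0.5, 0.3])

# Random tangent vectors to Delta_3 (components sum to zero).
u = rng.normal(size=3); u -= u.mean()
v = rng.normal(size=3); v -= v.mean()

direct = fisher_metric(p, u, v)
pullback = fisher_metric(f(p), df(u), df(v))

print(direct, pullback)  # the two values coincide
```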

Generalizations of this (original) version of Chentsov’s theorem exist. For instance, Ay et al. (2015) showed Chentsov’s theorem for arbitrary parametric probability distributions, and Dowty (2018) showed it for exponential family distributions.

## References

- Chentsov, N. N. “Statistical Decision Rules and Optimal Inference.” (1972).
- Campbell, L. Lorne. “An extended Čencov characterization of the information metric.” Proceedings of the American Mathematical Society 98, no. 1 (1986): 135-141.
- Amari, Shun-ichi. Information geometry and its applications. Vol. 194. Springer, 2016.
- Ay, Nihat, Jürgen Jost, Hông Vân Lê, and Lorenz Schwachhöfer. “Information geometry and sufficient statistics.” Probability Theory and Related Fields 162, no. 1-2 (2015): 327-364.
- Dowty, James G. “Chentsov’s theorem for exponential families.” Information Geometry 1, no. 1 (2018): 117-135.