Mason Wang

An Interpretation of KL Divergence

Imagine a lottery game. Let $X$ be a random variable describing the outcome of the lottery game, and $x$ be a realization of that random variable.

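Now, let’s talk about what you do as a player. You have one dollar to bet, and you know the true distribution $p(x)$ of $X$. The game sets its payoffs using a different distribution $q$: a dollar placed on outcome $x$ pays $1/q(x)$ if $x$ occurs. If you spread your dollar so that a fraction $p(x)$ goes on each outcome $x$, you win $p(x)/q(x)$ when $x$ occurs.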

Your expected log-winnings are:

\[\sum_x\left[ p(x) \log \left(\frac{p(x)}{q(x)} \right)\right]\]

This is actually the formula for KL-divergence.
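As a quick numerical check, here is a minimal sketch of the sum above. It assumes the game pays $1/q(x)$ per dollar on the winning outcome and that the player splits one dollar across outcomes. The function names and the example distributions are made up for illustration.

```python
import math

def kl_divergence(p, q):
    """Sum over x of p(x) * log(p(x) / q(x)), in nats. Terms with p(x) = 0 contribute 0."""
    return sum(px * math.log(px / qx) for px, qx in zip(p, q) if px > 0)

def expected_log_winnings(bets, p, q):
    """Expected log payout when a fraction bets[x] of one dollar is placed on outcome x.

    Assumed payoff rule: a dollar on the winning outcome x pays 1 / q[x].
    """
    return sum(px * math.log(bx / qx) for px, bx, qx in zip(p, bets, q) if px > 0)

# Illustrative numbers only.
p = [0.5, 0.25, 0.25]  # true distribution of the outcome
q = [0.25, 0.25, 0.5]  # distribution used to set the payoffs

kl = kl_divergence(p, q)
bet_on_p = expected_log_winnings(p, p, q)                # bet the true probabilities
bet_on_q = expected_log_winnings(q, p, q)                # bet the payoff distribution
bet_other = expected_log_winnings([0.4, 0.3, 0.3], p, q) # some other allocation

print(kl, bet_on_p, bet_on_q, bet_other)
```

With these numbers, betting the true probabilities reproduces the KL divergence, betting according to $q$ gives expected log-winnings of zero, and the other allocation lands in between.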

In other words, $D_{\text{KL}}(p \| q)$ is the maximum expected log-winnings obtainable from a one-dollar bet when the payoffs are set according to the distribution $q$ but the outcomes actually follow $p$.
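To see why this is a maximum, here is a short derivation sketch. It assumes the game pays $1/q(x)$ per dollar on the winning outcome and that you split your dollar as $b(x)$, with $b(x) \geq 0$ and $\sum_x b(x) = 1$ (the symbol $b$ is introduced here for illustration). Your expected log-winnings are then:

\[\sum_x p(x) \log \left(\frac{b(x)}{q(x)}\right) = \sum_x p(x) \log \left(\frac{p(x)}{q(x)}\right) - \sum_x p(x) \log \left(\frac{p(x)}{b(x)}\right) = D_{\text{KL}}(p \| q) - D_{\text{KL}}(p \| b)\]

Since $D_{\text{KL}}(p \| b) \geq 0$, with equality exactly when $b = p$ (Gibbs' inequality), no allocation does better than betting the true probabilities, and that choice earns $D_{\text{KL}}(p \| q)$.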

To Do: review other interpretations.

Last Reviewed: 1/20/25