Mason Wang

Downsampling and Stretching

Lots of math here. You can re-derive this by hand from Jensen's inequality or look at your notes.

Basically, start with log(p(x)). Then:
- express it using a latent variable model decomposition, choosing an arbitrary q(z) as your 'weighting': log( int( q(z) * p(x,z)/q(z) dz ) )
- apply Jensen's inequality to push the log inside the integral: int( q(z) * log( p(x,z)/q(z) ) dz )
- when you get to p(x,z), split it up into p(z|x) and p(x); that lets you take out log(p(x)), and also gives you a KL term
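A spelled-out version of those steps as a math block (standard ELBO algebra, same symbols as above):

```latex
\begin{align*}
\log p(x) &= \log \int q(z)\,\frac{p(x,z)}{q(z)}\,dz
  && \text{multiply and divide by an arbitrary } q(z) \\
&\ge \int q(z)\,\log\frac{p(x,z)}{q(z)}\,dz \;=\; \mathrm{ELBO}(q)
  && \text{Jensen's inequality (log is concave)} \\
\mathrm{ELBO}(q) &= \mathbb{E}_{q(z)}\!\left[\log\frac{p(z\mid x)\,p(x)}{q(z)}\right]
  && \text{split } p(x,z) = p(z\mid x)\,p(x) \\
&= \log p(x) - \mathrm{KL}\!\left(q(z)\,\middle\|\,p(z\mid x)\right)
  && \log p(x) \text{ comes out of the expectation}
\end{align*}
```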

ELBO = log(p(x)) - KL(q(z) || p(z|x))   (the KL has q as its first argument, i.e. the expectation is taken under q, treating q as the 'ground truth')
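A quick numeric check of that identity on a made-up model with a discrete latent; all the numbers and variable names below are illustrative, only the identity itself comes from the notes:

```python
import numpy as np

# Toy check of ELBO = log p(x) - KL(q(z) || p(z|x)) with z in {0, 1, 2}.
p_z = np.array([0.5, 0.3, 0.2])          # prior p(z) (made up)
p_x_given_z = np.array([0.9, 0.4, 0.1])  # likelihood p(x | z) for one observed x (made up)

p_xz = p_z * p_x_given_z                 # joint p(x, z)
p_x = p_xz.sum()                         # evidence p(x)
p_z_given_x = p_xz / p_x                 # exact posterior p(z | x)

q_z = np.array([0.6, 0.3, 0.1])          # an arbitrary variational q(z)

elbo = np.sum(q_z * np.log(p_xz / q_z))        # E_q[log p(x,z) - log q(z)]
kl = np.sum(q_z * np.log(q_z / p_z_given_x))   # KL(q(z) || p(z|x))

print(f"log p(x)  = {np.log(p_x):.6f}")
print(f"ELBO      = {elbo:.6f}")         # always <= log p(x)
print(f"ELBO + KL = {elbo + kl:.6f}")    # equals log p(x)
```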

Maximizing log(p(x)) with respect to the parameters of the model and of q(z) is expectation maximization, i.e. coordinate ascent on the ELBO. This means:
- you can raise the ELBO (and hence the lower bound on log(p(x))) by changing the parametrization of the model, i.e. p(z) / p(x|z), with q held fixed
- OR you can make the bound tighter by changing q(z)'s parametrization, since the gap is exactly KL(q(z) || p(z|x)); setting q(z) = p(z|x) closes it
(a toy coordinate-ascent sketch is at the bottom of these notes)

Last Reviewed: 1/19/25
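A minimal sketch of EM as coordinate ascent on the ELBO, assuming a made-up 1-D two-component Gaussian mixture with fixed unit variance; the data, component count, and names are all illustrative:

```python
import numpy as np

# E-step tightens the bound (sets q = p(z|x)); M-step raises the ELBO.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(3.0, 1.0, 300)])

def log_gauss(x, mu):
    # log N(x; mu, 1)
    return -0.5 * (x - mu) ** 2 - 0.5 * np.log(2 * np.pi)

pi = np.array([0.5, 0.5])    # mixture weights p(z)
mu = np.array([-1.0, 1.0])   # component means

for step in range(50):
    # E-step: set q(z | x_i) to the exact posterior p(z | x_i) under the
    # current parameters, which makes KL(q || p(z|x)) = 0 (tight bound).
    log_joint = np.log(pi) + log_gauss(x[:, None], mu[None, :])   # shape (N, 2)
    log_px = np.logaddexp(log_joint[:, 0], log_joint[:, 1])
    resp = np.exp(log_joint - log_px[:, None])                    # q(z | x_i)

    # M-step: with q fixed, update the model parameters to raise the ELBO,
    # and therefore the lower bound on sum_i log p(x_i).
    Nk = resp.sum(axis=0)
    pi = Nk / len(x)
    mu = (resp * x[:, None]).sum(axis=0) / Nk

print("weights:", pi)   # roughly [0.4, 0.6]
print("means:  ", mu)   # roughly [-2, 3]
```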