Multi-Source Diffusion Models

Audio mixtures are the sum of individual sources that share a context.
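
A minimal sketch of the "mixture = sum of sources" idea, using toy sine waves in place of real instrument stems. The names `sine_source`, `mix`, `bass`, and `guitar`, and the sample rate, are illustrative assumptions and not taken from the paper.

```python
import math

SAMPLE_RATE = 8000  # Hz; arbitrary value chosen for this toy example


def sine_source(freq_hz, amplitude, n_samples, sample_rate=SAMPLE_RATE):
    """Return a toy 'source': a sine wave as a list of floats."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * t / sample_rate)
        for t in range(n_samples)
    ]


def mix(sources):
    """Mixture y = sum over n of x_n, computed sample by sample."""
    return [sum(samples) for samples in zip(*sources)]


n = 16
bass = sine_source(110.0, 0.5, n)    # hypothetical "bass" stem
guitar = sine_source(440.0, 0.3, n)  # hypothetical "guitar" stem
mixture = mix([bass, guitar])
```

Separation is the inverse of `mix`: given only `mixture`, recover `bass` and `guitar`.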

The joint distribution over sources does not factorize into the product of the marginals, because the sources are dependent. Knowing the joint does, however, determine the distribution over mixtures, via marginalization.
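
Written out, under assumed notation (sources $x_1, \dots, x_N$, mixture $y = \sum_n x_n$; the symbols are my own shorthand and may differ from the paper's):

```latex
% Dependence: the joint is not the product of the per-source marginals.
p(x_1, \dots, x_N) \neq \prod_{n=1}^{N} p_n(x_n)

% Mixture distribution obtained from the joint by marginalizing
% over all source configurations that sum to y.
p(y) = \int p(x_1, \dots, x_N)\,
       \delta\!\Big(y - \sum_{n=1}^{N} x_n\Big)\,
       dx_1 \cdots dx_N
```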

The reverse direction (distribution over mixtures -> joint over sources) is harder than joint -> mixtures.

Compositional music generation is closely connected to source separation: the sources need to be separated out, then composed together.

Source separation models either:

learn a single model for each source distribution and condition on the mixture during inference, or
target the conditional distribution of the sources given the mixture directly.
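
In symbols, a rough summary of the two options (same assumed notation: sources $x_n$, mixture $y$; labels (a) and (b) are mine):

```latex
% (a) One prior per source, learned independently;
%     the mixture y only enters at inference time.
p_1(x_1),\; p_2(x_2),\; \dots,\; p_N(x_N)

% (b) A single conditional model of all sources given the mixture.
p(x_1, \dots, x_N \mid y)
```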

They

to do: read beyond intro

Last Reviewed: 10/9/25