Mason Wang

MIDI-DDSP

Chaining several systems together improves controllability

Some people combine MIDI with traditional DSP synthesis, but it is hard to generate realistic timbre this way.

Vision has systems optimized for both realism and control

Concatenative systems have realism, but manual stitching limits control and expression.

analysis: audio -> ddsp parameters -> performance -> notes

synthesis: notes -> performance -> ddsp parameters -> audio
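The synthesis direction above can be sketched as three chained functions. This is a toy illustration only, not the paper's models: the learned generators are replaced by fixed rules, and every name here (`Note`, `Performance`, `SynthFrame`, the vibrato and volume controls, the 16 kHz / 250 Hz rates) is an assumption made for the sketch. The last stage is a plain additive harmonic synthesizer standing in for the DDSP synthesizer.

```python
import math
from dataclasses import dataclass

SAMPLE_RATE = 16000  # assumed audio rate for this sketch
FRAME_RATE = 250     # assumed rate of synthesis-parameter frames


@dataclass
class Note:
    midi_pitch: int    # MIDI note number, 69 = A4
    duration_s: float


@dataclass
class Performance:
    note: Note
    volume: float         # hypothetical expression control in [0, 1]
    vibrato_hz: float     # hypothetical vibrato rate
    vibrato_depth: float  # hypothetical vibrato depth in semitones


@dataclass
class SynthFrame:
    f0_hz: float
    amplitude: float
    harmonics: list  # relative harmonic weights, summing to 1


def midi_to_hz(pitch):
    return 440.0 * 2.0 ** ((pitch - 69) / 12.0)


def notes_to_performance(notes):
    # Stand-in for an expression generator: same expression for every note.
    return [Performance(n, volume=0.8, vibrato_hz=5.0, vibrato_depth=0.2)
            for n in notes]


def performance_to_params(perfs, n_harmonics=8):
    # Stand-in for a synthesis-parameter generator: fixed 1/k harmonic rolloff.
    weights = [1.0 / k for k in range(1, n_harmonics + 1)]
    total = sum(weights)
    weights = [w / total for w in weights]
    frames = []
    for p in perfs:
        n_frames = max(1, round(p.note.duration_s * FRAME_RATE))
        for i in range(n_frames):
            t = i / FRAME_RATE
            semitones = p.vibrato_depth * math.sin(2 * math.pi * p.vibrato_hz * t)
            pos = i / n_frames
            # Linear fade in and out over the first and last 10% of the note.
            env = min(1.0, pos / 0.1, (1.0 - pos) / 0.1)
            frames.append(SynthFrame(midi_to_hz(p.note.midi_pitch + semitones),
                                     p.volume * env, weights))
    return frames


def params_to_audio(frames):
    # Additive synthesis: sum of harmonics of f0, with a running phase
    # so pitch changes between frames do not cause clicks.
    hop = SAMPLE_RATE // FRAME_RATE
    audio, phase = [], 0.0
    for fr in frames:
        for _ in range(hop):
            phase += 2 * math.pi * fr.f0_hz / SAMPLE_RATE
            sample = 0.0
            for k, w in enumerate(fr.harmonics, start=1):
                if k * fr.f0_hz < SAMPLE_RATE / 2:  # skip harmonics above Nyquist
                    sample += w * math.sin(k * phase)
            audio.append(fr.amplitude * sample)
    return audio


def synthesize(notes):
    return params_to_audio(performance_to_params(notes_to_performance(notes)))


audio = synthesize([Note(69, 0.5), Note(72, 0.2)])
print(len(audio))
```

Each level can be edited before passing it on (change a note, a vibrato depth, or a single frame's f0), which is the sense in which chaining the stages gives control.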

A composer usually writes the notes, a performer interprets them, and the instrument converts the performance to sound: notes, performance, synthesis.

Three trainable modules: DDSP inference, synthesis param generator, and expression generator.

Three fixed functions: DDSP synthesis, feature extraction, note detection.

requires pitch detection/note detection, so limited to single monophonic instruments.
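A rough illustration of why pitch detection ties the system to monophonic input: a pitch tracker returns one f0 per frame, so there is no slot for a second simultaneous note. The sketch below uses plain autocorrelation, which is an assumption made for brevity and not the detector used in the paper; `estimate_f0` and its parameters are hypothetical names.

```python
import math


def estimate_f0(frame, sample_rate, fmin=80.0, fmax=1000.0):
    """Return a single f0 estimate (Hz) for one frame via autocorrelation."""
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        score = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


sr = 16000
tone = [math.sin(2 * math.pi * 220.0 * n / sr) for n in range(1024)]
print(estimate_f0(tone, sr))
```

The estimate is quantized to integer lags, so it lands near 220 Hz rather than exactly on it. With two notes sounding at once, the single returned value would describe at most one of them.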

Trains a single model on > 12 instruments, with generation conditioned on the instrument at every stage.
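One common way to condition a single model on the instrument is to look up a per-instrument embedding and append it to the features at every time step. These notes do not record how the paper implements its conditioning, so the sketch below is an assumption: the instrument list, `EMBED_DIM`, and `condition_on_instrument` are hypothetical, and the random vectors stand in for embeddings that would be learned.

```python
import random

INSTRUMENTS = ["violin", "flute", "trumpet"]  # hypothetical subset
EMBED_DIM = 4                                 # hypothetical embedding size

# Random vectors standing in for learned embeddings.
_rng = random.Random(0)
embedding_table = {
    name: [_rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
    for name in INSTRUMENTS
}


def condition_on_instrument(frame_features, instrument):
    """Append the instrument embedding to the feature vector of every frame."""
    emb = embedding_table[instrument]
    return [list(f) + emb for f in frame_features]


features = [[60.0, 0.5], [60.0, 0.7]]  # e.g. (pitch, volume) per frame
conditioned = condition_on_instrument(features, "flute")
print(len(conditioned), len(conditioned[0]))
```

The same weights then serve every instrument; only the appended vector changes.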

Contributions

Method

Skipped

Related work

Last Reviewed 10/8/25