InstrumentGen Pipeline Reproduction
SEP–DEC 2024 · DAC + CLAP + MUSICGEN · COURSE PROJECT · audio · research
A reproduction of an instrument-conditioned audio generation pipeline: DAC (the Descript Audio Codec) produces audio features, which feed into CLAP (Contrastive Language-Audio Pretraining) embeddings, which are in turn routed into MusicGen for synthesis. The pipeline runs end to end, from audio conditioning to generated waveform.
Reproducing a paper is a specific kind of work. You don't get to invent; you have to follow. But following surfaces two things that reading doesn't: which details the authors elided because they considered them obvious, and which details they didn't realize mattered. Both categories become legible only once the pipeline is running end to end on your own machine. That's the value. The final output is a working run; the real product is an understanding of what the original authors knew that they didn't write down.