The official demos of the paper:
Demos - The best performance we obtained with a larger training set
Model:
MDN_K=4 (0.27 million parameters)
Unet-5_K=4 (13.3 million parameters)
Demos - The experiments in our paper
Genre:
This paper presents a new input format, channel-wise subband input (CWS), for convolutional neural network (CNN) based music source separation (MSS) models in the frequency domain. We aim to address the major issues in CNN-based high-resolution MSS models: high computational cost and weight sharing between distinctly different bands. Specifically, in this paper we decompose the input mixture spectra into several bands and concatenate them channel-wise as the model input. The proposed approach enables effective weight sharing in each subband and introduces more flexibility between channels. For comparison purposes, we perform voice and accompaniment separation (VAS) on models with different scales, architectures, and CWS settings. The results show that the CWS input is beneficial in many aspects. Across all our experiments, it enables models to obtain a 6.9% performance gain on average. Even with a smaller number of parameters, much less training data, and shorter training time, our MDenseNet with 8-band CWS input still surpasses the original MMDenseNet by a large margin. CWS also reduces computational cost and training time to a large extent, which can considerably expedite the experiment process.
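To make the "decompose into bands and concatenate channel-wise" idea concrete, here is a minimal NumPy sketch of the tensor layout only. It assumes the simplest possible decomposition, cutting the frequency axis of a spectrogram into K equal, contiguous chunks; the paper's actual subband decomposition may be done differently, so treat this as an illustration of shapes rather than the authors' implementation. The function names `cws_split` and `cws_merge` are hypothetical.

```python
import numpy as np


def cws_split(spec: np.ndarray, k: int) -> np.ndarray:
    """Turn (channels, freq_bins, frames) into (channels * k, freq_bins // k, frames).

    Assumption: bands are equal-width, contiguous slices of the frequency axis.
    """
    c, f, t = spec.shape
    if f % k != 0:
        raise ValueError("freq_bins must be divisible by k")
    # Give the band index its own axis: (c, k, f // k, t).
    bands = spec.reshape(c, k, f // k, t)
    # Fold the band axis into the channel axis: channel index = c_idx * k + band.
    return bands.reshape(c * k, f // k, t)


def cws_merge(bands: np.ndarray, k: int) -> np.ndarray:
    """Inverse of cws_split: stack the k bands of each channel back along frequency."""
    ck, fb, t = bands.shape
    if ck % k != 0:
        raise ValueError("channel count must be divisible by k")
    c = ck // k
    return bands.reshape(c, k, fb, t).reshape(c, k * fb, t)


if __name__ == "__main__":
    # Stereo input with 8 frequency bins and 3 frames, split into K=4 bands.
    spec = np.arange(2 * 8 * 3, dtype=np.float32).reshape(2, 8, 3)
    sub = cws_split(spec, k=4)
    print(sub.shape)
    print(np.array_equal(cws_merge(sub, k=4), spec))
```

With this layout, a convolution kernel sees each band as a separate input channel, so each band gets its own input-layer weights, while the spatial size of the feature maps along frequency shrinks by a factor of K.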
Our code is open source on GitHub!
For users in mainland China, you can visit this site for a better user experience.
For users outside mainland China, I recommend visiting this site.
Please allow up to two minutes for the audio to load. Thanks for your patience!
We trained our models on additional data beyond MUSDB: 35.18 hours of pure vocals and 279.87 hours of pure music. Each of these experiments takes approximately five days on a single GTX 1080Ti GPU. We finally reached the following results.
Due to copyright issues, we cannot present the full length of each mixture, but we offer a link for each song. Click the "" button next to each song to access the mixture.
Click on the music you'd like to listen to and you will see a full list of experiment results.
Note that I'm not a music professional; I categorised these songs by instinct.
Model | Accompaniments | Vocal |
---|---|---|
UNET-5 | ||
UNET-5_K=2 | ||
UNET-5_K=4 | ||
UNET-5_K=8 | ||
MMDN | ||
MDN | ||
MDN_K=2 | ||
MDN_K=4 | ||
MDN_K=8 | ||
UNET-6 | ||
UNET-6_K=2 | ||
UNET-6_K=4 | ||
UNET-6_K=8 | ||
BD-UNET-6 | ||
BD-UNET-6_K=2 | ||
BD-UNET-6_K=4 | ||
BD-UNET-6_K=8 |
Facebook-Demucs also separated the following three songs for its demo. Their results are presented on this page.
Model | Accompaniments | Vocal |
---|---|---|
UNET-5 | ||
UNET-5_K=2 | ||
UNET-5_K=4 | ||
UNET-5_K=8 | ||
MMDN | ||
MDN | ||
MDN_K=2 | ||
MDN_K=4 | ||
MDN_K=8 | ||
UNET-6 | ||
UNET-6_K=2 | ||
UNET-6_K=4 | ||
UNET-6_K=8 | ||
BD-UNET-6 | ||
BD-UNET-6_K=2 | ||
BD-UNET-6_K=4 | ||
BD-UNET-6_K=8 |