Abstract
Most single-channel audio source separation approaches produce separated sources that are contaminated by interference from the other sources and by other distortions. To tackle this problem, we propose to separate the sources in two stages. In the first stage, the sources are separated from the mixed signal. In the second stage, the interference between the separated sources and the distortions are reduced using deep neural networks (DNNs). We propose two methods that use DNNs to improve the quality of the separated sources in the second stage. In the first method, each separated source is improved individually using its own trained DNN, while in the second method all the separated sources are improved together using a single DNN. The DNNs in the second stage are trained discriminatively to further decrease the interference and the distortions of the separated sources. Our experimental results show that, compared to a single stage of separation, using two stages improves the quality of the separated signals by decreasing both the interference between the separated sources and the distortions.
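The data flow of the two second-stage methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify features, network architectures, or the training objective, so the names `Frame`, `enhance_individually`, `enhance_jointly`, and `discriminative_loss`, the weight `lam`, and the source labels are all hypothetical. The loss is one common form of discriminative objective (squared error to the matching target, minus a weighted squared error to the other targets) and may differ from the one used in the paper. The enhancers stand in for trained DNNs.

```python
from typing import Callable, Dict, List

Frame = List[float]                    # hypothetical: one spectral frame of a source
Enhancer = Callable[[Frame], Frame]    # stand-in for a trained second-stage DNN


def enhance_individually(stage1: Dict[str, Frame],
                         enhancers: Dict[str, Enhancer]) -> Dict[str, Frame]:
    """First method: each stage-one estimate is passed through its own model."""
    return {name: enhancers[name](frame) for name, frame in stage1.items()}


def enhance_jointly(stage1: Dict[str, Frame],
                    joint_enhancer: Enhancer) -> Dict[str, Frame]:
    """Second method: a single model sees all stage-one estimates at once."""
    names = sorted(stage1)
    width = len(stage1[names[0]])
    # Concatenate the estimates into one input vector (an assumed layout).
    stacked = [value for name in names for value in stage1[name]]
    out = joint_enhancer(stacked)
    # Split the model output back into one frame per source.
    return {name: out[i * width:(i + 1) * width] for i, name in enumerate(names)}


def discriminative_loss(outputs: Dict[str, Frame],
                        targets: Dict[str, Frame],
                        lam: float = 0.1) -> float:
    """Assumed objective: fit the matching target, move away from the others."""
    def squared_error(a: Frame, b: Frame) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    loss = 0.0
    for name, out in outputs.items():
        loss += squared_error(out, targets[name])
        for other, target in targets.items():
            if other != name:
                loss -= lam * squared_error(out, target)
    return loss


if __name__ == "__main__":
    # Toy stage-one estimates; an identity function replaces the trained DNNs.
    stage1 = {"vocals": [1.0, 2.0], "drums": [3.0, 4.0]}
    identity = lambda frame: list(frame)
    print(enhance_individually(stage1, {"vocals": identity, "drums": identity}))
    print(enhance_jointly(stage1, identity))
    print(discriminative_loss(stage1, stage1, lam=0.5))
```

With a purely squared-error objective the second term would be absent; the subtracted term is what penalizes an output for resembling the wrong source, which is the intuition behind training to reduce interference.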
| Original language | English |
|---|---|
| Pages (from-to) | 1773-1783 |
| Number of pages | 11 |
| Journal | IEEE/ACM Transactions on Audio, Speech, and Language Processing |
| Volume | 25 |
| Issue number | 9 |
| DOIs | |
| Publication status | Published - 1 Sept 2017 |
| Externally published | Yes |
| Title | Two-stage single-channel audio source separation using deep neural networks |
Research output
- Improving single-network single-channel separation of musical audio with convolutional layers
  Roma, G., Green, O. & Tremblay, P. A., 6 Jun 2018, Latent Variable Analysis and Signal Separation: 14th International Conference, LVA/ICA 2018, Guildford, UK, July 2–5, 2018, Proceedings. Gannot, S., Deville, Y., Mason, R., Plumbley, M. D. & Ward, D. (eds.). Springer Verlag, p. 306-315 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 10891 LNCS). Conference contribution, peer-reviewed. Open access. 6 citations (Scopus).
- Combining mask estimates for single channel audio source separation using deep neural networks
  Grais, E. M., Roma, G., Simpson, A. J. & Plumbley, M. D., Sept 2016, Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. p. 3339-3343 (5 p.). Conference contribution, peer-reviewed. Open access. 23 citations (Scopus).