Two-stage single-channel audio source separation using deep neural networks

Emad M Grais, Gerard Roma, Andrew JR Simpson, Mark D Plumbley

Research output: Contribution to journal › Article

13 Citations (Scopus)

Abstract

Most single-channel audio source separation approaches produce separated sources that are contaminated by interference from the other sources and by other distortions. To tackle this problem, we propose to separate the sources in two stages. In the first stage, the sources are separated from the mixed signal. In the second stage, deep neural networks (DNNs) reduce the interference between the separated sources and the distortions. We propose two methods that use DNNs to improve the quality of the separated sources in the second stage. In the first method, each separated source is improved individually using its own trained DNN; in the second method, all the separated sources are improved together using a single DNN. The DNNs in the second stage are trained discriminatively to further decrease the interference and the distortions of the separated sources. Our experimental results show that, compared with a single stage of separation, using two stages improves the quality of the separated signals by decreasing both the interference between the separated sources and the distortions.
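
The data flow of the two second-stage methods described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the use of magnitude-spectrogram frames as input, the tiny untrained one-hidden-layer networks, the ReLU activations, the weight `gamma`, and the particular form of `discriminative_cost` are all assumptions made here for illustration, and every function name is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(n_in, n_hidden, n_out):
    """Randomly initialised one-hidden-layer network (untrained, illustration only)."""
    return {
        "W1": rng.standard_normal((n_in, n_hidden)) * 0.1,
        "b1": np.zeros(n_hidden),
        "W2": rng.standard_normal((n_hidden, n_out)) * 0.1,
        "b2": np.zeros(n_out),
    }

def mlp(params, x):
    """Forward pass; the output ReLU keeps the estimated magnitudes non-negative."""
    h = np.maximum(x @ params["W1"] + params["b1"], 0.0)
    return np.maximum(h @ params["W2"] + params["b2"], 0.0)

def enhance_individually(estimates, dnns):
    """Method 1: each first-stage estimate passes through its own DNN."""
    return [mlp(p, s) for p, s in zip(dnns, estimates)]

def enhance_jointly(estimates, dnn):
    """Method 2: a single DNN sees all first-stage estimates and outputs all sources."""
    n_bins = estimates[0].shape[1]
    out = mlp(dnn, np.concatenate(estimates, axis=1))
    return [out[:, i * n_bins:(i + 1) * n_bins] for i in range(len(estimates))]

def discriminative_cost(outputs, targets, gamma=0.1):
    """Assumed cost: pull each output towards its own target and push it
    away from the other sources' targets, weighted by gamma."""
    cost = 0.0
    for i, y in enumerate(outputs):
        cost += np.sum((y - targets[i]) ** 2)
        for j, t in enumerate(targets):
            if j != i:
                cost -= gamma * np.sum((y - t) ** 2)
    return cost

# Hypothetical first-stage output: 2 sources, 5 frames, 8 frequency bins each.
n_frames, n_bins, n_src = 5, 8, 2
stage1 = [np.abs(rng.standard_normal((n_frames, n_bins))) for _ in range(n_src)]

per_source = enhance_individually(
    stage1, [init_mlp(n_bins, 16, n_bins) for _ in range(n_src)]
)
joint = enhance_jointly(stage1, init_mlp(n_src * n_bins, 16, n_src * n_bins))
```

In both variants the second stage returns one enhanced estimate per source with the same shape as its first-stage input; the difference is only whether the network for a given source can see the other sources' first-stage estimates.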
Language: English
Pages: 1773-1783
Number of pages: 11
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Volume: 25
Issue number: 9
DOI: 10.1109/TASLP.2017.2716443
Publication status: Published - 1 Sep 2017
Externally published: Yes

Fingerprint

Source separation
Deep neural networks

Cite this

Grais, Emad M ; Roma, Gerard ; Simpson, Andrew JR ; Plumbley, Mark D. / Two-stage single-channel audio source separation using deep neural networks. In: IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2017 ; Vol. 25, No. 9. pp. 1773-1783.
@article{d80218a134b34359af718cb1bebe33e1,
title = "Two-stage single-channel audio source separation using deep neural networks",
author = "Grais, {Emad M} and Gerard Roma and Simpson, {Andrew JR} and Plumbley, {Mark D}",
year = "2017",
month = "9",
day = "1",
doi = "10.1109/TASLP.2017.2716443",
language = "English",
volume = "25",
pages = "1773--1783",
journal = "IEEE/ACM Transactions on Audio, Speech, and Language Processing",
publisher = "IEEE",
number = "9",

}

