Singing voice separation using deep neural networks and f0 estimation

Gerard Roma, Emad M Grais, Andrew JR Simpson, Mark D Plumbley

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Deep Neural Networks (DNNs) have become a popular approach for speech enhancement and singing voice separation. DNNs are typically trained to estimate a time-frequency mask using ground truth examples. In this submission, we combine DNN estimation as a first step with traditional refinement via F0 estimation, using the YINFFT algorithm.
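The two-stage idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a soft vocal mask has already been produced by some DNN and that a per-frame F0 track (0 Hz for unvoiced frames) has already been obtained from a pitch tracker such as YINFFT. The function names (`harmonic_mask`, `refine_vocal_mask`, `separate`) and the parameters `width_bins` and `floor` are hypothetical, chosen only for illustration.

```python
import numpy as np


def harmonic_mask(f0_hz, n_bins, sample_rate, n_fft, width_bins=1):
    """Binary comb mask with ones around the harmonics of F0 in each frame.

    f0_hz: array of shape (n_frames,), with 0 marking unvoiced frames.
    Returns an array of shape (n_bins, n_frames).
    """
    f0_hz = np.asarray(f0_hz, dtype=float)
    mask = np.zeros((n_bins, len(f0_hz)))
    bin_hz = sample_rate / n_fft  # frequency spacing between STFT bins
    for t, f0 in enumerate(f0_hz):
        if f0 <= 0:
            continue  # unvoiced frame: no harmonics marked
        harmonics = np.arange(f0, sample_rate / 2, f0)
        centers = np.round(harmonics / bin_hz).astype(int)
        for c in centers:
            lo = max(c - width_bins, 0)
            hi = min(c + width_bins + 1, n_bins)
            mask[lo:hi, t] = 1.0
    return mask


def refine_vocal_mask(dnn_mask, f0_hz, sample_rate, n_fft,
                      width_bins=1, floor=0.1):
    """Attenuate the DNN mask away from F0 harmonics in voiced frames.

    Assumption of this sketch: unvoiced frames keep the DNN mask unchanged,
    and off-harmonic bins are scaled by `floor` rather than zeroed.
    """
    f0_hz = np.asarray(f0_hz, dtype=float)
    comb = harmonic_mask(f0_hz, dnn_mask.shape[0], sample_rate, n_fft,
                         width_bins)
    voiced = f0_hz > 0
    refined = dnn_mask.copy()
    refined[:, voiced] *= np.maximum(comb[:, voiced], floor)
    return refined


def separate(mixture_stft, vocal_mask):
    """Apply a time-frequency mask and its complement to a mixture STFT."""
    vocals = vocal_mask * mixture_stft
    accompaniment = (1.0 - vocal_mask) * mixture_stft
    return vocals, accompaniment
```

In this sketch the mask and its complement sum to one, so the two estimates always add back up to the mixture; how the submission itself combines the DNN output with the F0 estimate is not specified in the abstract.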
Original language: English
Title of host publication: MIREX 2016
Number of pages: 2
Publication status: Published - 2016
Externally published: Yes
Event: 12th Music Information Retrieval Evaluation Exchange - New York City, United States
Duration: 7 Aug 2016 to 12 Aug 2016
Conference number: 12
http://www.music-ir.org/mirex/wiki/2016:Main_Page (Link to Conference Website)

Conference

Conference: 12th Music Information Retrieval Evaluation Exchange
Abbreviated title: MIREX 2016
Country: United States
City: New York City
Period: 7/08/16 to 12/08/16

Fingerprint

Speech enhancement
Masks
Deep neural networks

Cite this

Roma, G., Grais, E. M., Simpson, A. JR., & Plumbley, M. D. (2016). Singing voice separation using deep neural networks and f0 estimation. In MIREX 2016 (MIREX Google Scholar).
Roma, Gerard ; Grais, Emad M ; Simpson, Andrew JR ; Plumbley, Mark D. / Singing voice separation using deep neural networks and f0 estimation. MIREX 2016. 2016. (MIREX Google Scholar).
@inproceedings{d27f968126d74d669e2d65d5885239e1,
  title = "Singing voice separation using deep neural networks and f0 estimation",
  abstract = "Deep Neural Networks (DNNs) have become a popular approach for speech enhancement and singing voice separation. DNNs are typically trained to estimate a time-frequency mask using ground truth examples. In this submission, we combine DNN estimation as a first step with traditional refinement via F0 estimation, using the YINFFT algorithm.",
  author = "Gerard Roma and Grais, {Emad M} and Simpson, {Andrew JR} and Plumbley, {Mark D}",
  year = "2016",
  language = "English",
  booktitle = "MIREX 2016"
}

Roma, G, Grais, EM, Simpson, AJR & Plumbley, MD 2016, Singing voice separation using deep neural networks and f0 estimation. in MIREX 2016. MIREX Google Scholar, 12th Music Information Retrieval Evaluation Exchange, New York City, United States, 7/08/16.

Singing voice separation using deep neural networks and f0 estimation. / Roma, Gerard; Grais, Emad M; Simpson, Andrew JR; Plumbley, Mark D.

MIREX 2016. 2016. (MIREX Google Scholar).

TY  - GEN
T1  - Singing voice separation using deep neural networks and f0 estimation
AU  - Roma, Gerard
AU  - Grais, Emad M
AU  - Simpson, Andrew JR
AU  - Plumbley, Mark D
PY  - 2016
Y1  - 2016
N2  - Deep Neural Networks (DNNs) have become a popular approach for speech enhancement and singing voice separation. DNNs are typically trained to estimate a time-frequency mask using ground truth examples. In this submission, we combine DNN estimation as a first step with traditional refinement via F0 estimation, using the YINFFT algorithm.
AB  - Deep Neural Networks (DNNs) have become a popular approach for speech enhancement and singing voice separation. DNNs are typically trained to estimate a time-frequency mask using ground truth examples. In this submission, we combine DNN estimation as a first step with traditional refinement via F0 estimation, using the YINFFT algorithm.
UR  - http://www.music-ir.org/mirex/wiki/2016:Singing_Voice_Separation_Results
M3  - Conference contribution
BT  - MIREX 2016
ER  -

Roma G, Grais EM, Simpson AJR, Plumbley MD. Singing voice separation using deep neural networks and f0 estimation. In MIREX 2016. 2016. (MIREX Google Scholar).