Journal articles

Perceptually controlled doping for audio source separation

Abstract : The separation of an underdetermined audio mixture can be performed through sparse component analysis (SCA), which however relies on the strong hypothesis that the source signals are sparse in some domain. To overcome this difficulty when the original sources are available before the mixing process, informed source separation (ISS) embeds in the mixture a watermark whose information can help a subsequent separation. Though powerful, this technique is generally specific to a particular mixing setup and may be compromised by an additional bitrate compression stage. Thus, instead of watermarking, we propose a 'doping' method that makes the time-frequency representation of each source sparser while preserving its audio quality. This method iteratively decreases the distance between the distribution of the signal and a target sparse distribution, under a perceptual constraint. We aim to show that the proposed approach is robust to audio coding and that using the sparsified signals improves source separation compared with using the original sources. In this work, the analysis is restricted to instantaneous mixtures and focuses on voice sources.
Document type :
Journal articles

Contributor : Gaël Mahé
Submitted on : Friday, July 13, 2018 - 5:33:44 PM
Last modification on : Friday, August 5, 2022 - 11:40:57 AM
Long-term archiving on : Monday, October 15, 2018 - 1:30:34 PM


Publisher files allowed on an open archive


  • HAL Id : hal-01839081, version 1



Gaël Mahé, Everton Z Nadalin, Ricardo Suyama, Joao M. T. Romano. Perceptually controlled doping for audio source separation. EURASIP Journal on Advances in Signal Processing, 2014, 2014, pp.27 - 27. ⟨hal-01839081⟩


