Single Channel Source Separation
Bernie C. Till

 

Jang, G, & Lee, T, 2003: A Maximum Likelihood Approach to Single-Channel Source Separation, Journal of Machine Learning Research vol 4, pp 1365-1392
This paper presents a new technique for achieving blind signal separation when given only a single channel recording. The main concept is based on exploiting a priori sets of time-domain basis functions learned by independent component analysis (ICA) to the separation of mixed source signals observed in a single channel. The inherent time structure of sound sources is reflected in the ICA basis functions, which encode the sources in a statistically efficient manner. We derive a learning algorithm using a maximum likelihood approach given the observed single channel data and sets of basis functions. For each time point we infer the source parameters and their contribution factors. This inference is possible due to prior knowledge of the basis functions and the associated coefficient densities. A flexible model for density estimation allows accurate modeling of the observation and our experimental results exhibit a high level of separation performance for simulated mixtures as well as real environment recordings employing mixtures of two different sources.
Jang, G, & Lee, T, 2002: A Probabilistic Approach to Single Channel Blind Signal Separation, Proc NIPS'02
We present a new technique for achieving source separation when given only a single channel recording. The main idea is based on exploiting the inherent time structure of sound sources by learning a priori sets of basis filters in time domain that encode the sources in a statistically efficient manner. We derive a learning algorithm using a maximum likelihood approach given the observed single channel data and sets of basis filters. For each time point we infer the source signals and their contribution factors. This inference is possible due to the prior knowledge of the basis filters and the associated coefficient densities. A flexible model for density estimation allows accurate modeling of the observation and our experimental results exhibit a high level of separation performance for mixtures of two music signals as well as the separation of two voice signals.
Jang, G, Lee, T, & Oh, Y, 2003: A Subspace Approach to Single Channel Signal Separation using Maximum Likelihood Weighting Filters, Proc ICASSP'03, vol 5, pp 45-48
The goal of this work is to extract multiple source signals when only a single channel observation is available. We propose a new signal separation algorithm based on a subspace decomposition. The observation is transformed into subspaces of interest with different sets of basis functions. A flexible model for density estimation allows an accurate modeling of the distributions of the source signals in the subspaces, and we develop a filtering technique using a maximum likelihood (ML) approach to match the observed single channel data with the decomposition. Our experimental results show good separation performance on simulated mixtures of two music signals as well as two voice signals.
Jang, G, Lee, T, & Oh, Y, 2001: Blind Separation of Single Channel Mixture using ICA Basis Functions, Proc 3rd Int'l Conf ICA & BSS, ICA'01
A new technique has been developed to enable blind source separation given only a single channel recording. The proposed method infers source signals and their contribution factors at each time point by a number of adaptation steps maximizing log-likelihood of the estimated source parameters given the observed single channel data and sets of basis functions. This inferencing is possible due to the prior information on the inherent time structure of the sound sources by learning a priori sets of time-domain basis functions and the associated coefficient densities that encode the sources in a statistically efficient manner. A flexible model for density estimation allows accurate modeling of the observation and our experimental results show close-to-perfect separation on simulated mixtures as well as recordings in a real environment employing mixtures of two different sources.
Hochreiter, S, & Mozer, M C: Monaural Separation and Classification of Mixed Signals - A Support-Vector Regression Perspective
We address the problem of extracting multiple independent sources from a single mixture signal. Standard independent component analysis approaches fail when the number of sources is greater than the number of mixtures. For this case, the sparse-decomposition method has been proposed. The method relies on a dictionary of atomic signals and recovers the degree to which various dictionary atoms are present in the mixture. We show that the sparse-decomposition method is equivalent to a form of support-vector regression (SVR). The training inputs for the SVR are the dictionary atoms, and the corresponding targets are the dot product of the mixture and atom vectors. The SVR perspective provides a new interpretation of the sparse-decomposition method's hyperparameter, and allows us to generalize and improve the method. The most important insight is that the sources do not have to be identical to dictionary atoms, but rather we can accommodate a many-to-one mapping of source signals to dictionary atoms - a classification of sorts - characterized by a known nonlinear transformation with unknown parameters. The limitation of the SVR perspective is that it cannot recover the signal strength of an atom in the mixture; rather, it can only recover whether or not a particular atom was present. In experiments, we show that our model can handle difficult problems involving classification of sources. Our model may be particularly useful for speech signal processing and CDMA-based mobile communication, where in both cases we have knowledge about the invariances in the signal.
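The sparse-decomposition step described in this abstract can be illustrated with a minimal sketch. It assumes one common formulation, L1-penalized least squares over an overcomplete dictionary, solved here with plain iterative soft-thresholding (ISTA). The solver choice, the random dictionary, and the names `ista`, `soft_threshold`, and `lam` are illustrative assumptions, not the authors' SVR-based method.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t (proximal step for the L1 penalty)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(D, x, lam, n_iter=3000):
    """Minimize 0.5*||D c - x||^2 + lam*||c||_1 by iterative soft-thresholding.

    D holds one dictionary atom per column; the returned c says how strongly
    each atom is present in the mixture x.
    """
    step = 1.0 / np.linalg.norm(D, 2) ** 2   # 1 / (largest singular value)^2
    c = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ c - x)             # gradient of the quadratic term
        c = soft_threshold(c - step * grad, step * lam)
    return c

# Hypothetical toy problem: 128 random unit-norm atoms in 64 dimensions,
# so there are more candidate sources than the signal has dimensions.
rng = np.random.default_rng(0)
D = rng.normal(size=(64, 128))
D /= np.linalg.norm(D, axis=0)

support = np.array([5, 40, 99])              # atoms actually present
true_c = np.zeros(128)
true_c[support] = [2.0, -2.0, 2.0]
x = D @ true_c                               # the single observed mixture

c_hat = ista(D, x, lam=0.05)
top3 = np.sort(np.argsort(-np.abs(c_hat))[:3])
print("atoms with largest recovered weight:", top3)
```

The hyperparameter `lam` trades reconstruction error against sparsity; it is the quantity for which the abstract says the SVR view offers a new interpretation.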
Cauwenberghs, G, 1999: Monaural Separation of Independent Acoustical Components, Proc ISCAS'99
The problem of blindly separating signal mixtures with fewer mixture components than independent signal sources is mathematically ill-defined, and requires suitable prior information on the nature of the sources. Recently, it has been shown that sparse methods for function approximation using a Laplacian prior can be effective, but the method fails to separate a single mixture without further prior information. Other techniques track harmonics, but assume separability in the time-frequency domain. We show that a measure of temporal and spectral coherence provides an effective cue for separating independent acoustical or sonar sources, in the absence of spatial cues in the monaural case. The technique is shown to successfully separate single mixtures of sources with significant spectral overlap.
Roweis, S T: One Microphone Source Separation
Source separation, or computational auditory scene analysis, attempts to extract individual acoustic objects from input which contains a mixture of sounds from different sources, altered by the acoustic environment. Unmixing algorithms such as ICA and its extensions recover sources by reweighting multiple observation sequences, and thus cannot operate when only a single observation signal is available. I present a technique called refiltering which recovers sources by a nonstationary reweighting ("masking") of frequency sub-bands from a single recording, and argue for the application of statistical algorithms to learning this masking function. I present results of a simple factorial HMM system which learns on recordings of single speakers and can then separate mixtures using only one observation signal by computing the masking function and then refiltering.
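The refiltering idea in this abstract, a time-varying reweighting of frequency sub-bands of one recording, can be sketched as follows. This is a simplified illustration under stated assumptions: it uses non-overlapping rectangular frames, synthetic gated tones, and an oracle binary mask computed from the known sources. In the paper the masking function is instead inferred by a factorial HMM trained on single speakers; the helper names here are hypothetical.

```python
import numpy as np

def frames_fft(x, frame_len):
    """Cut x into non-overlapping frames and return the real FFT of each frame."""
    n_frames = len(x) // frame_len
    frames = x[:n_frames * frame_len].reshape(n_frames, frame_len)
    return np.fft.rfft(frames, axis=1)

def frames_ifft(spec, frame_len):
    """Invert frames_fft: inverse FFT per frame, then concatenate the frames."""
    return np.fft.irfft(spec, n=frame_len, axis=1).reshape(-1)

def refilter(mixture, mask, frame_len):
    """Reweight each time-frequency cell of the mixture by mask and resynthesize."""
    return frames_ifft(frames_fft(mixture, frame_len) * mask, frame_len)

fs, frame_len = 8000, 256
t = np.arange(fs) / fs

# Two synthetic sources (assumed for illustration): gated tones at 440 Hz and 1800 Hz.
s1 = np.sin(2 * np.pi * 440 * t) * (np.sin(2 * np.pi * 2 * t) > 0)
s2 = np.sin(2 * np.pi * 1800 * t) * (np.sin(2 * np.pi * 3 * t) > 0)
mix = s1 + s2                                  # the only signal a real system observes

# Oracle mask: 1 where source 1 dominates a time-frequency cell, else 0.
# A learned model would have to estimate this from the mixture alone.
mag1 = np.abs(frames_fft(s1, frame_len))
mag2 = np.abs(frames_fft(s2, frame_len))
mask = (mag1 > mag2).astype(float)

est1 = refilter(mix, mask, frame_len)
est2 = refilter(mix, 1.0 - mask, frame_len)

n = len(est1)                                  # trailing partial frame is dropped
print("correlation with source 1:", np.corrcoef(est1, s1[:n])[0, 1])
print("correlation with source 2:", np.corrcoef(est2, s2[:n])[0, 1])
```

Because the two masks sum to one in every cell, the two estimates add back up to the framed mixture; the quality of separation depends entirely on how well the mask is chosen.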
Jang, G, Lee, T, & Oh, Y, 2003: Single Channel Signal Separation Using MAP-based Subspace Decomposition
An algorithm for single channel signal separation is presented. The algorithm projects the observed signal to given subspaces, and recovers the original sources by probabilistic weighting and recombining the subspace signals. The results of separating mixtures of two different natural sounds are reported.
Jang, G, Lee, T, & Oh, Y, 2003: Single Channel Signal Separation Using Maximum Likelihood Subspace Projection, Proc 3rd Int'l Conf ICA & BSS, ICA'03
This paper presents a technique for extracting multiple source signals when only a single channel observation is available. The proposed separation algorithm is based on a subspace decomposition. The observation is projected onto subspaces of interest with different sets of basis functions, and the original sources are obtained by weighted sums of the projections. A flexible model for density estimation allows an accurate modeling of the distributions of the source signals in the subspaces, and we develop a filtering technique using a maximum likelihood (ML) approach to match the observed single channel data with the decomposition. Our experimental results show good separation performance on simulated mixtures of two music signals as well as two voice signals.
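The phrase "weighted sums of the projections" can be made concrete with a small sketch. It is a simplification under stated assumptions: a single orthonormal DCT-II basis stands in for the learned basis sets, each basis vector spans one subspace, and the weights are set by hand rather than estimated by maximum likelihood. The names `dct_basis` and `weighted_projection` are hypothetical.

```python
import numpy as np

def dct_basis(n):
    """Return an n x n matrix whose rows form an orthonormal DCT-II basis."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    basis[0] /= np.sqrt(2.0)                   # rescale the constant row to unit norm
    return basis

def weighted_projection(x, basis, weights):
    """Project x onto each basis row, scale projection k by weights[k], and sum."""
    coeffs = basis @ x                         # coordinate of x along each basis vector
    return basis.T @ (weights * coeffs)

n = 64
B = dct_basis(n)
rng = np.random.default_rng(1)

# Assumed toy sources: source 1 occupies low-index subspaces, source 2 high-index ones.
c1 = np.zeros(n)
c1[:8] = rng.normal(size=8)
c2 = np.zeros(n)
c2[40:48] = rng.normal(size=8)
s1, s2 = B.T @ c1, B.T @ c2
x = s1 + s2                                    # single-channel observation

# Hand-set weights standing in for the ML-estimated ones: each subspace is
# assigned wholly to source 1 (weight 1) or to source 2 (weight 0).
w = np.zeros(n)
w[:32] = 1.0

est1 = weighted_projection(x, B, w)
est2 = weighted_projection(x, B, 1.0 - w)
print("max error, source 1:", np.max(np.abs(est1 - s1)))
print("max error, source 2:", np.max(np.abs(est2 - s2)))
```

Recovery is exact here only because the toy sources occupy disjoint subspaces. When sources overlap within a subspace, fractional weights are needed, which is the role the abstract assigns to the density models and the ML filtering step.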
Jang, G, Lee, T, & Oh, Y, 2003: Single Channel Signal Separation Using Time-Domain Basis Functions, IEEE Signal Processing Letters
We present a new technique for achieving blind source separation when given only a single channel recording. The main idea is based on exploiting the inherent time structure of sound sources by learning a priori sets of time-domain basis functions that encode the sources in a statistically efficient manner. We derive a learning algorithm using a maximum likelihood approach given the observed single channel data and sets of basis functions. For each time point we infer the source parameters and their contribution factors using a flexible but simple density model. We show separation results of two music signals as well as the separation of two voice signals.