Saturday, 29 November 2014

Adding Noise of a certain SNR to audio files

A common task when dealing with audio is adding noise to files, e.g. if you want to test the performance of a speech recognition system in the presence of noise. Controlling how much noise you add is based on computing the Signal to Noise Ratio (SNR) of the speech vs. the noise. To compute the energy in a speech file, just sum the squares of all the samples:

$E_{Speech} = \sum_{i=1}^{N} s(i)^2$

where $$s(i)$$ is the vector of speech samples you read with a function like wavread (audioread in newer MATLAB releases). We will also need some noise, which we can generate using a function like randn(N,1), where N is the length of the speech signal. Alternatively, we can use a dedicated noise file containing e.g. babble noise and truncate it to the correct length. When using a noise file, it is important to randomise the start position for each file so you don't always have e.g. a door banging or a guy laughing at the same point in every file. This can mess with classifiers, which may pick up on the repeated event rather than the speech. Anyway, now compute the energy of the noise:

$E_{Noise} = \sum_{i=1}^{N} n(i)^2$

where $$n(i)$$ is the noise vector. To compute the SNR of a speech file compared to noise:

$SNR = 10\log_{10} \left( \dfrac{E_{Speech}}{E_{Noise}} \right)$
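As a rough illustration of the energy and SNR formulas above, here is a minimal Python sketch (plain Python rather than MATLAB, so it is not the snr.m mentioned in this post). The function names `energy` and `snr_db` are made up for this example, and the signals are assumed to be plain lists of float samples of equal length.

```python
import math

def energy(samples):
    """Sum of squared samples, as in the energy formulas above."""
    return sum(x * x for x in samples)

def snr_db(speech, noise):
    """SNR in dB of a speech vector relative to a noise vector."""
    return 10.0 * math.log10(energy(speech) / energy(noise))

# Toy signals: speech energy is 4 and noise energy is 0.04, a ratio of 100.
speech = [1.0, -1.0, 1.0, -1.0]
noise = [0.1, -0.1, 0.1, -0.1]
print(round(snr_db(speech, noise), 3))  # 20.0
```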

If you don't have the pure noise but only a corrupted version of the original, you can compute the noise as $$n(i) = x(i) - s(i)$$, where $$x(i)$$ is the corrupted signal. This only works if the clean and corrupted signals are the same length and time-aligned.

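Now we want to scale the noise by a factor $$K$$ and add it to the original speech signal so that the result has the SNR we want. This assumes we have a target SNR; for the sake of this post, assume we want the noise to be at 20 dB SNR. Scaling the noise by $$K$$ multiplies its energy by $$K^2$$, so we need $$E_{Speech} / (K^2 E_{Noise}) = 10^{20/10}$$, which gives the following formula. (If the noise has been normalised to unit energy, i.e. $$E_{Noise} = 1$$, the $$E_{Noise}$$ term drops out. Unit variance is not enough for that: unit-variance noise of length $$N$$ has an energy of about $$N$$.)

$K = \sqrt{ \dfrac{E_{Speech}}{E_{Noise} \times 10^{20/10}} }$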
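As a quick sanity check of this formula, here is a worked example with made-up numbers. Assume the noise has been normalised so that its energy is $$E_{Noise} = 1$$, and that the speech energy is $$E_{Speech} = 400$$. Then:

$K = \sqrt{ \dfrac{400}{10^{20/10}} } = \sqrt{ \dfrac{400}{100} } = 2$

The scaled noise has energy $$K^2 \times E_{Noise} = 4$$, so the resulting SNR is $$10\log_{10}(400/4) = 10\log_{10}(100) = 20$$ dB, as intended.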

Once we have $$K$$, we scale the noise samples to get $$\hat{n}(i) = K\times n(i)$$. Our noisy speech file is then calculated as:

$x(i) = s(i) + \hat{n}(i)$

for $$i = 1 \ldots N$$. You should now be able to compute the SNR between the new noisy signal and the original signal, taking the noise as $$x(i) - s(i)$$, and it should come out very close to 20 dB (it could be 19.9 or 20.1 or something). MATLAB function for computing SNR: snr.m; function for adding white noise to files: addnoise.m.
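The whole procedure can be sketched end to end in a few lines of Python. This is only an illustration under some assumptions, not the addnoise.m linked above: the names `random_segment` and `add_noise` are hypothetical, the "speech" is a synthetic sine wave and the "noise recording" is Gaussian noise standing in for real files, and the scale factor divides by the noise energy so the noise does not have to be normalised first.

```python
import math
import random

def energy(samples):
    """Sum of squared samples."""
    return sum(x * x for x in samples)

def random_segment(noise_recording, length, rng):
    """Cut `length` samples from a longer recording at a random start.

    Assumes the recording is at least `length` samples long.
    """
    start = rng.randint(0, len(noise_recording) - length)
    return noise_recording[start:start + length]

def add_noise(speech, noise, target_snr_db):
    """Scale `noise` to hit `target_snr_db` relative to `speech`, then mix."""
    k = math.sqrt(energy(speech) / (energy(noise) * 10 ** (target_snr_db / 10)))
    return [s + k * n for s, n in zip(speech, noise)]

rng = random.Random(0)  # fixed seed so the example is repeatable

# Stand-ins for real audio: one second of a 440 Hz tone at 16 kHz as the
# "speech", and three seconds of Gaussian noise as the "noise file".
speech = [math.sin(2 * math.pi * 440 * i / 16000) for i in range(16000)]
noise_recording = [rng.gauss(0.0, 1.0) for _ in range(48000)]

noise = random_segment(noise_recording, len(speech), rng)
noisy = add_noise(speech, noise, 20.0)

# Verify: recover the added noise as x(i) - s(i) and measure the SNR.
residual = [x - s for x, s in zip(noisy, speech)]
measured = 10 * math.log10(energy(speech) / energy(residual))
print(round(measured, 2))  # 20.0
```

With real recordings, the samples would come from whatever audio reader you use, and writing the result back to a fixed-point file format may shift the measured SNR slightly.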