Thursday 27 November 2014

Generating Pretty Spectrograms in MATLAB

Spectrograms are a time-frequency representation of speech (or any other) signals. It can be difficult to make them pretty, as there are a lot of settings that change various properties. This post will supply some code for generating spectrograms in MATLAB, along with an explanation of all the settings that affect the final spectrogram.

The file itself can be found here: myspectrogram.m.

An example of what it can do:

We will now look at some of the default settings. To call it all you need to do is: myspectrogram(signal,fs);. This assumes signal is a sequence of floats that represent the time domain sequence of an audio file, and fs is the sampling frequency. fs is used to display the time (in seconds) at the bottom of the spectrogram.

There are a lot more settings, the full call with everything included is:

[handle] = myspectrogram(s, fs, nfft, T, win, Slim, alpha, cmap, cbar);

s - speech signal
fs - sampling frequency
nfft - fft analysis length, default 1024
T - vector of frame width and frame shift (ms), i.e. [Tw, Ts], default [18,1] in ms
w - analysis window handle, default @hamming
Slim - vector of spectrogram limits (dB), i.e. [Smin Smax], default [-45, -2]
alpha - fir pre-emphasis filter coefficients, default false (no preemphasis)
cmap - color map, default 'default'
cbar - color bar (boolean), default false (no colorbar)

nfft is the number of points used in the FFT, larger values of nfft will have more detail, but there will be diminishing returns. 1024 is a good value.

The frame lengths and shifts can be important, shorter window lengths give better time resolution but poorer frequency resolution. Long window lengths give better frequency resolution but poorer time resolution. By playing with these numbers you can get a better idea of how they work.

The window function is specified as an inline function. @hamming is the MATLAB hamming function. If you want blackman you would use @blackman. For a parameterised window like the Chebyshev, you can use @(x)chebwin(x,30) using 30dB as the chebyshev parameter. For a rectangular window you can use @(x)ones(x,1).

The vector of spectrogram limits clips the spectrogram at these points. First the highest point of the log spectrogram is set to 0, the everything outside the limits is set to the limits. This can make spectrograms look much cleaner if there is noise below e.g. -50dB, which is not audible but makes the spectrogram look messy. You can remove it with the limits.



  1. Thanks, James!
    Works perfectly.
    Could you please give a hint how to adjust frequency range? For example, frequency range of my sample from 0 to 20kHz, but I neer to see just from 0.1 to 0.5Hz.

    1. are you sure you mean 0.1 to 0.5Hz? that is very low. To restrict the range between e.g. 100 and 500 Hz, use ylim([100,500]) after calling myspectrogram.