## Thursday, 27 November 2014

### Reading and writing NIST, RAW and WAV files in MATLAB

To open a file when you are not sure whether it is NIST or WAV, use audioread.m, which depends on the read_X_file.m functions explained below.

# NIST files

NIST files are very common in speech processing; for example, the TIMIT and RM1 speech databases are in NIST format. A NIST file starts with a 1024-byte header of ASCII text, and the speech data follows immediately after this header. For TIMIT and RM1 the speech data is 16-bit signed integers, which can be read directly from the file and treated as a signal.

To get the sampling frequency, you'll have to parse the text that forms the header. In any case, functions to read and write NIST files are supplied here: read_NIST_file.m and write_NIST_file.m.
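
If you are curious what reading such a file involves, the following is a minimal sketch, not the contents of read_NIST_file.m. It assumes a 1024-byte header, 16-bit little-endian samples, and a `sample_rate -i N` line in the header. So that it runs without TIMIT, it first writes a tiny synthetic file; the file name and sample values are made up for illustration.

```matlab
% Write a tiny synthetic NIST-style file (illustrative data only).
fname = [tempname() '.wav'];
hdr = sprintf('NIST_1A\n1024\nsample_rate -i 16000\nsample_n_bytes -i 2\n');
hdr = [hdr repmat(sprintf('\n'), 1, 1024 - length(hdr))];  % pad to 1024 bytes
fid = fopen(fname, 'w', 'ieee-le');
fwrite(fid, uint8(hdr), 'uint8');
fwrite(fid, int16([0 100 -100 32767 -32768]), 'int16');
fclose(fid);

% Read it back: the header first, then the samples.
fid = fopen(fname, 'r', 'ieee-le');          % assumption: little-endian samples
header = fread(fid, 1024, 'uint8=>char')';   % 1024-byte ASCII header as a row
signal = fread(fid, inf, 'int16');           % rest of the file, 16-bit signed
fclose(fid);
delete(fname);

% Pull the sampling frequency out of the header text.
tok = regexp(header, 'sample_rate -i (\d+)', 'tokens', 'once');
fs = str2double(tok{1});
```

After this runs, `signal` holds the samples as a column of doubles and `fs` holds the sampling frequency parsed from the header.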

Note that writing a NIST file using the scripts above requires a header. The easiest way to get a header is to read another NIST file. So if you want to modify a NIST file then you would use something like:

[signal, fs, header] = read_NIST_file('/path/to/file.wav');


This reuses the header from the file you just read and works fine. If you want to create completely new files, i.e. there is no header to copy, I recommend not creating NIST-formatted files; create WAV files instead, as they are far more widely supported.

Here is an example header from the TIMIT file timit/test/dr3/mrtk0/sx283.wav. Note the magic number NIST_1A in the first 7 bytes of the file. The actual audio data starts at byte 1024; the space between the end of the header text and byte 1024 is just newline characters.

NIST_1A
1024
database_id -s5 TIMIT
database_version -s3 1.0
utterance_id -s10 rtk0_sx283
channel_count -i 1
sample_count -i 50791
sample_rate -i 16000
sample_min -i -2780
sample_max -i 4675
sample_n_bytes -i 2
sample_byte_format -s2 01
sample_sig_bits -i 16
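
A header like this can be turned into a struct with a few lines of MATLAB. This is only a sketch under some assumptions: every field line has the form `name -type value`, `-i` marks an integer, and anything else is kept as a string. The variable names (`header`, `hdr_lines`, `info`) are made up for the example, and the header text is a shortened copy of the one above.

```matlab
% Shortened copy of the example header, built inline so the sketch is runnable.
header = sprintf(['NIST_1A\n1024\ndatabase_id -s5 TIMIT\n' ...
                  'sample_count -i 50791\nsample_rate -i 16000\n' ...
                  'sample_min -i -2780\nsample_byte_format -s2 01\n']);

% Check the magic number in the first 7 bytes.
is_nist = strncmp(header, 'NIST_1A', 7);

hdr_lines = strsplit(header, sprintf('\n'));
info = struct();
for k = 1:numel(hdr_lines)
    % Field lines look like: name -type value
    tok = regexp(hdr_lines{k}, '^(\w+) -(\w+) (.*)$', 'tokens', 'once');
    if isempty(tok)
        continue;   % skips 'NIST_1A', '1024' and empty lines
    end
    if strcmp(tok{2}, 'i')
        info.(tok{1}) = str2double(tok{3});   % integer field
    else
        info.(tok{1}) = tok{3};               % keep other types as text
    end
end
```

With the fields in a struct, the sampling frequency is simply `info.sample_rate` and the number of samples is `info.sample_count`.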


# RAW files

RAW files are just pure audio data without a header. This can make it a little difficult to figure out what is actually in them; often you just have to try things until you get meaningful data coming out. A common setting is 16-bit signed integer samples. You'll also have to take care of endianness: if the files were written with a different byte order from the one you read them with (for example, little-endian files from a Windows machine read as big-endian), all the integers will be byte-swapped and you'll have to swap them back.
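
The endianness problem is easy to see with a small experiment. The sketch below assumes 16-bit signed samples; it writes three made-up values as little-endian, reads them back with both byte orders, and then repairs the wrongly read version with `swapbytes`. The file name and values are for illustration only.

```matlab
% Write three 16-bit samples in little-endian byte order.
fname = [tempname() '.raw'];
fid = fopen(fname, 'w');
fwrite(fid, int16([1 256 -2]), 'int16', 0, 'ieee-le');
fclose(fid);

% Read the same bytes with both byte orders.
fid = fopen(fname, 'r');
x_le = fread(fid, inf, 'int16=>int16', 0, 'ieee-le');  % correct order
frewind(fid);
x_be = fread(fid, inf, 'int16=>int16', 0, 'ieee-be');  % wrong order
fclose(fid);
delete(fname);

% Swap the two bytes of every sample to undo the wrong byte order.
x_fixed = swapbytes(x_be);
```

Here `x_le` comes back as the original values, `x_be` comes back scrambled, and `x_fixed` matches `x_le` again. With real speech data, the wrong byte order usually shows up as loud noise rather than a recognisable waveform, so plotting or listening to the signal is a quick check.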