Thursday, 20 August 2015

Caffe Tutorial - Part 1 - basic feedforward nets

Caffe is a deep learning framework often used for image recognition projects, but I've found that the tutorials aren't that great, or at least I have trouble finding solutions to the problems I have. So I've decided to write a sequence of tutorials to lay out information that I can revise in the future and that others might use as well.

The first tutorial (this one) builds a basic feed-forward neural net with a single hidden layer, trained on the Fisher iris dataset. I'll use MATLAB to write the feature files that Caffe reads. For this tutorial we'll use the HDF5Data layer type, because HDF5 files are easy to write.

Also try this lenet tutorial as it covers some things that this page does not.

Writing the Data Files

This is a short MATLAB script for writing the train and test files. Each file must contain both the features and the labels. Other languages can be used as well, e.g. Python has a library for writing HDF5 files. Note that fisheriris is built in to MATLAB. When we load it the features are 150x4, but they have to be transposed (4x150) before being written to the HDF5 files, because MATLAB stores arrays in column-major order, so Caffe sees the dimensions reversed (150 samples by 4 features). The labels come as a cell array of strings, so we need to convert them to integers starting at 0. We randomly choose 100 samples for the train set and the remaining 50 for the test set.
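For anyone preparing the data outside MATLAB, the label conversion and the 100/50 split described above can be sketched in plain Python. This is only an illustration under assumptions: the function names `encode_labels` and `split_indices` are made up for this sketch, the species strings are written the way fisheriris spells them, and actually writing the HDF5 files (for example with a library such as h5py) is left out.

```python
import random

def encode_labels(names):
    """Map each distinct label string to an integer, starting at 0."""
    # Sort the distinct names so the mapping is reproducible.
    mapping = {name: i for i, name in enumerate(sorted(set(names)))}
    return [mapping[name] for name in names], mapping

def split_indices(n_samples, n_train, seed=0):
    """Shuffle the sample indices and split them into train and test."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return indices[:n_train], indices[n_train:]

# 150 labels in the same order as the iris data: 50 of each species.
species = ["setosa"] * 50 + ["versicolor"] * 50 + ["virginica"] * 50
labels, mapping = encode_labels(species)
train_idx, test_idx = split_indices(len(species), n_train=100)

print(mapping)
print(len(train_idx), len(test_idx))
```

The fixed seed is just there to make the split repeatable between runs. Drop it if you want a different split each time.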

We use InnerProduct layers, as these are fully connected feedforward layers. They are initialised using the 'xavier' algorithm, which is explained in this other blog post.
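To give a rough idea of what 'xavier' initialisation does, here is a minimal sketch. It assumes the variant that draws each weight uniformly from [-sqrt(3/fan_in), sqrt(3/fan_in)], which is my understanding of Caffe's default behaviour for this filler. The function name `xavier_uniform` is hypothetical, and the layer is sized like the hidden layer described later in this post (4 inputs, 20 nodes).

```python
import math
import random

def xavier_uniform(fan_in, fan_out, seed=0):
    """Return a fan_out x fan_in weight matrix drawn from U(-a, a),
    where a = sqrt(3 / fan_in). Assumed scaling, see the text above."""
    rng = random.Random(seed)
    limit = math.sqrt(3.0 / fan_in)
    return [[rng.uniform(-limit, limit) for _ in range(fan_in)]
            for _ in range(fan_out)]

# Hidden layer of the iris net: 4 input features, 20 hidden nodes.
weights = xavier_uniform(fan_in=4, fan_out=20)
limit = math.sqrt(3.0 / 4)

print(len(weights), len(weights[0]))
```

The point of scaling by the number of inputs is to keep the size of each node's weighted sum roughly the same no matter how many inputs feed into it.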

We now have 2 files: iris_train.hdf5 containing the train features and labels, and iris_test.hdf5 containing the test features and labels.

Specifying the Network

At this point I recommend reading a bit about layers, nets and blobs. You can also look at the types of layers available. Our network will have a single hidden layer with 20 nodes. The iris dataset has 3 classes, so our output layer will have 3 nodes. We use the rectified linear unit (ReLU) activation for the hidden layer and softmax outputs.
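The computation this network performs on one sample can be sketched in plain Python. This is not Caffe code, just an illustration of the 4 -> 20 -> 3 architecture with ReLU and softmax. The weights are random placeholders, and the function names and the example feature vector are assumptions made for this sketch.

```python
import math
import random

def inner_product(x, weights, bias):
    """Fully connected layer: one output per row of weights."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(values):
    """Rectified linear unit: negative values become zero."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    # Subtract the max before exponentiating for numerical stability.
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def forward(x, w1, b1, w2, b2):
    """4 features -> 20 hidden nodes (ReLU) -> 3 class probabilities."""
    hidden = relu(inner_product(x, w1, b1))
    return softmax(inner_product(hidden, w2, b2))

# Placeholder weights; a trained net would have learned values here.
rng = random.Random(0)
w1 = [[rng.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(20)]
b1 = [0.0] * 20
w2 = [[rng.uniform(-0.5, 0.5) for _ in range(20)] for _ in range(3)]
b2 = [0.0] * 3

# An iris-like sample: four measurements.
probs = forward([5.1, 3.5, 1.4, 0.2], w1, b1, w2, b2)
print(probs)
```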

We do both training and testing in this network, which is why there are 2 data layers: one with phase: TRAIN and one with phase: TEST. One annoying bit is that we can't directly specify the location of the HDF5 file. Instead we have to specify the location of a txt file that lists the location of the HDF5 file on its first line. So "./files/train_file_location.txt", which appears in the network specification, actually has as its only contents:

./files/iris_train.hdf5

which is where the MATLAB script saved the train HDF5 file.
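If you would rather generate that txt file than type it by hand, a small helper along these lines would do it. The function name `write_source_list` is made up for this sketch, and the example writes into a temporary directory so it doesn't overwrite anything real.

```python
import os
import tempfile

def write_source_list(list_path, hdf5_paths):
    """Write one HDF5 file path per line to the txt file Caffe is pointed at."""
    with open(list_path, "w") as f:
        for path in hdf5_paths:
            f.write(path + "\n")

# Demonstrated in a temporary directory so nothing real is overwritten.
demo_dir = tempfile.mkdtemp()
list_path = os.path.join(demo_dir, "train_file_location.txt")
write_source_list(list_path, ["./files/iris_train.hdf5"])

with open(list_path) as f:
    contents = f.read()
print(contents)
```

The same helper can be called a second time for the test file, with the path to iris_test.hdf5.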

At the end of the network specification we have a SoftmaxWithLoss layer as well as a Softmax layer. This is because we want to see the accuracy, and the SoftmaxWithLoss layer outputs only the scalar loss, which the Accuracy layer can't use. We pass the softmax outputs to the Accuracy layer so we can see how well the network performs. Note that accuracy is only computed during phase: TEST.
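To see why the two quantities need different inputs, here is a small sketch of what each one computes. It is not Caffe code: the function names and the softmax outputs are made up for illustration. The loss collapses everything into a single number, while the accuracy needs the per-class outputs to find the predicted class.

```python
import math

def softmax_loss(probs, labels):
    """Mean negative log-probability of the correct class."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

def accuracy(probs, labels):
    """Fraction of samples whose highest-probability class matches the label."""
    correct = sum(1 for p, y in zip(probs, labels) if p.index(max(p)) == y)
    return correct / len(labels)

# Made-up softmax outputs for three samples and their integer labels.
probs = [[0.7, 0.2, 0.1],
         [0.1, 0.8, 0.1],
         [0.3, 0.4, 0.3]]
labels = [0, 1, 2]

loss = softmax_loss(probs, labels)
acc = accuracy(probs, labels)
print(loss)
print(acc)
```

In this toy example the third sample is misclassified, so it lowers the accuracy and also contributes the largest term to the loss.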

Specifying the Solver

The solver is pretty standard.

Running Everything

Now you probably want to run it. Just type the following into a terminal from the directory that contains iris_caffe_solver.prototxt:

/path/to/install/caffe-master/build/tools/caffe train --solver=./iris_caffe_solver.prototxt