Tuesday, 7 February 2017

Solving XOR problem using tiny-dnn

Solving XOR problem using tiny-dnn

REVISED August 13th 2017

Almost all mature deep neural network (DNN) libraries e.g. Tensor Flow, Theano, Caffe, and etc are written in python, not in C/C++. We will hardly find library for DNN written in C/C++. Even if we find one, it requires heavy resources. Fortunately, we now have tiny-dnn. tiny-dnn is a C++11 implementation of deep learning. Nothing needs to be compiled, header only. It is suitable for deep learning on limited computational resource, embedded systems and IoT devices.

For new tiny-dnn user, it may hard to get used with the environment because the examples provided are directly designated to solve MNIST or CIFAR problem. On this post, I try to give example to solve simple problem (XOR) using tiny-dnn. It may sound excessive to use DNN framework only to solve XOR problem,. But for the sake of better understanding of framework structure, I think it’s okay to do so.

In this example, we will separate the train and test routines. The train routine will fit input against output and save the trained network into binary file. Then the test routine will read the trained network binary file and use it to yield result.

  • Step 0 : Preparing workspace
    Assume we work on directory dnn-dir that we will create

    mkdir dnn-dir
    cd dnn-dir
    touch train.cc
    touch test.cc
  • Step 1 : Cloning tiny-dnn
    First, we need too clone tiny-dnn from github :

    git clone https://github.com/tiny-dnn/tiny-dnn.git
  • At this step, your folder structure should be like following :
    - dnn-dir
    |--- tiny-dnn
    |--- train.cc
    |--- test.cc

  • Step 2 : defining problem
    Let’s define our XOR problem as follows :

    X Y X Y
    0 0 0
    0 1 1
    1 0 1
    1 1 0
  • Step 3 : creating trainer routine
    Create our data trainer inside train.cc

#include "tiny_dnn/tiny_dnn.h"

using namespace tiny_dnn;
using namespace std;

int main(int argc, char *argv[]){
    network<sequential> net;
    net << layers::fc(2,3) << activation::tanh()
        << layers::fc(3,1) << activation::tanh();

    vector<vec_t> input_data  { {0,0}, {0,1}, {1,0}, {1,1} };
    vector<vec_t> desired_out {   {0},   {1}, {1},     {0} };

    size_t batch_size = 1;
    size_t epochs = 1000;
    gradient_descent opt;
    net.fit<mse>(opt, input_data, desired_out, batch_size, epochs);

    double loss = net.get_loss<mse>(input_data, desired_out);
    cout << "mse : " << loss << endl;
    net.save("xor_net");
    return 0;
}

Just like the developer said, we don’t need to compile the library. All we need to do is including the library on our program (#include "tiny_dnn/tiny_dnn.h"). To simplify the program, we will use namespace tiny_dnn and std as usual.

Defining new network requires declaration using type tiny_dnn::network followed by the type of network (e.g. <sequential>) and the network name (net). Then our network layers and perceptron unit numbers per layer need to be configured. tiny-dnn provide several type of layer i.e. convolution net, average pooling net, fully connected net, or just simple multi layer perceptron. For our example which is multi layer perceptron, we can use fully connected net as follows :


    net << layers::fc(2,3) << activation::tanh()
        << layers::fc(3,1) << activation::tanh();

The script above defines an input layer with two inputs, one hidden layer with three perceptron and tan hyperbolic activation function, and an output layer with single output. Our network should look like this :

vector<vec_t> input_data  { {0,0}, {0,1}, {1,0}, {1,1} };
vector<vec_t> desired_out {   {0},   {1}, {1},     {0} };

tiny-dnn define new type for float array/vector called vec_t. We are gonna use it for place-holder of our XOR data.

size_t batch_size = 1;
size_t epochs = 1000;
gradient_descent opt;
net.fit<mse>(opt, input_data, desired_out, batch_size, epochs);

We could adjust the training parameter by assigning value to variable batch_size and epochs and choosing optimizer algorithm (e.g. gradient_descent, RMSprop, adam, etc.). Normally, we iterate over 1000 epochs and use gradient_descent algorithm. Then, we can train our net using command fit. Because of our problem is fitting problem, we use fit. But, if our problem is classification problem, we need train command.

double loss = net.get_loss<mse>(input_data, desired_out);
cout << "mse : " << loss << endl;
net.save("xor_net");

Finally, we get our result. We may know how good our net is by get_loss function. We are also able to save the trained network using save function.

  • Step 4 : creating test routine
    Create our test routine inside test.cc
#include "tiny_dnn/tiny_dnn.h"
#include <cstdlib>

using namespace tiny_dnn;
using namespace std;

int main(int argc, char *argv[]){
    network<sequential> net;
    net.load("xor_net");

    vec_t in = {atof(argv[1]), atof(argv[2])};
    vec_t result = net.predict(in); 
    cout << result.at(0) << endl;

    return 0;
}

All we do is loading the trained xor_net.

  • Step 5 : compiling and testing…
    Compile the file using g++ and add option for pthread.
g++ -std=c++14 -Itiny-dnn -pthread -O2 train.cc -o train
g++ -std=c++14 -Itiny-dnn -pthread -O2 test.cc -o test

The program should give result as follows :

$ ./train
mse : 0.0360134
$ ./test 0 0
0.0892
$ ./test 0 1
0.9934
$ ./test 1 0
0.9874
$ ./test 1 1
0.0986

That’s all the program to solve XOR problem using tiny-dnn. You may configure the network for more complicated architecture to solve more complex problem. In near future, tiny-dnn will support GPU processing to make our net more powerful.


Correct me if I am wrong
Sincerely,

Tirtadwipa Manunggal
tirtadwipa.manunggal@gmail.com




8 comments:

  1. So, I'm trying to run your examples, but I keep on getting a bunch of errors when trying to compile with g++ using the commands written here. These errors seem to be in TinyDNN, but I am able to successfully compile the examples with no problem. I am using your steps exactly. Is there something missing that I am just not understanding?

    ReplyDelete
    Replies
    1. Dear Unknown, I am sorry because I had barely maintained this blog. It seems the developer had changed how layer template should be declared. I have revised my code up there for recent codes. And also it's my mistake to add flag for C++11, it should be C++14. You could rerun the codes and it should be working. Please let me know if you have any difficulties. Enjoy your time.

      Delete
    2. Thanks! That worked perfectly!
      Thank you for your quick response!

      Delete
  2. Great tutorial Mr. Tirta! just to avoid warning, I think it should be "float" type in front of `vec_t in` and `vec_t result`. am I right..?

    ReplyDelete
    Replies
    1. I don't think so. 'vec_t' is a user-defined type as we can see at tiny_dnn/util/util.h line 63. It had been explicitly stated that vec_t is a float-typed linked-list. Thus giving type float in front of vec_t calling is not necessary and might cause syntax error.

      Delete
  3. Thanks for this nice tutorial.
    Would you accept it to be merged into the tiny-dnn github?

    ReplyDelete
  4. Thanks for this nice tutorial.
    Would you accept it to be merged int the tiny-dnn github?

    ReplyDelete