Wednesday 5 September 2012

Successful Integration of FANN

I am pleased to say that all my recent work seems to have borne fruit, and I have now managed to code up a training and testing routine in Octave that uses the FANN library and its Octave bindings. I think that this has been some of my most challenging coding work up to now, and it required many hours of web research and forum help to complete.

One frustration I find with open source software is its sparse, and sometimes non-existent, documentation, so this blog post is partly intended as a guide for readers who may also wish to use FANN in Octave. The code in the code box below is roughly divided into these sections:
  • Octave code to index into and extract the relevant data from previously saved files
  • a section that uses Perl to format this data
  • the Octave binding code that actually implements the FANN library functions to set up and train a NN
  • a short bit of code to save and then test the NN on the training data
As the code itself is heavily commented, no further comment is required.
% load training_data_1.mat on command line before running this script.

clear -exclusive X accurate_period y % clear everything except X, accurate_period and y

yy = eye(5)(y,:) ; % using training labels y, create an output vector suitable for NN training

period = input('Enter period of interest: ') ;

%for period = 10:50

fprintf('\nTraining for ANN period: %f\n', period ) ;

% This first switch control block creates the training data by indexing, by period, into the
% data loaded from training_data_1.mat
switch (period)

case 10

% index using input period
[i_X j_X] = find( accurate_period(:,1) == period ) ;
% extract the relevant part of X using above i_X index
X_train = X( [i_X] , : ) ;
% and same for market labels vector y
y_train = yy( [i_X] , : ) ;

% now index using input period plus 1 for test set
[i_X j_X] = find( accurate_period(:,1) == period+1 ) ;
% extract the relevant part of X using above i_X index
X_test = X( [i_X] , : ) ;
y_test = yy( [i_X] , : ) ;

train_data = [ X_train y_train ] ;
test_data = [ X_test y_test ] ;
detect_optima = train_data( (60:60:9000) , : ) ;

case 50

% index using input period
[i_X j_X] = find( accurate_period(:,1) == period ) ;
% extract the relevant part of X using above i_X index
X_train = X( [i_X] , : ) ;
% and same for market labels vector y
y_train = yy( [i_X] , : ) ;

% now index using input period minus 1 for test set
[i_X j_X] = find( accurate_period(:,1) == period-1 ) ;
% extract the relevant part of X using above i_X index
X_test = X( [i_X] , : ) ;
y_test = yy( [i_X] , : ) ;

train_data = [ X_train y_train ] ;
test_data = [ X_test y_test ] ;
detect_optima = train_data( (60:60:9000) , : ) ;

otherwise

% index using input period
[i_X j_X] = find( accurate_period(:,1) == period ) ;
% extract the relevant part of X using above i_X index
X_train = X( [i_X] , : ) ;
% and same for market labels vector y
y_train = yy( [i_X] , : ) ;

% now index using input period minus 1 for test set
[i_X j_X] = find( accurate_period(:,1) == period-1 ) ;
% extract the relevant part of X using above i_X index
X_test_1 = X( [i_X] , : ) ;
% and take every other value
X_test_1 = X_test_1( (2:2:9000) , : ) ;
% and same for market labels vector y
y_test_1 = yy( [i_X] , : ) ;
% and take every other value
y_test_1 = y_test_1( (2:2:9000) , : ) ;

% now index using input period plus 1 for test set
[i_X j_X] = find( accurate_period(:,1) == period+1 ) ;
% extract the relevant part of X using above i_X index
X_test_2 = X( [i_X] , : ) ;
% and take every other value
X_test_2 = X_test_2( (2:2:9000) , : ) ;
% and same for market labels vector y
y_test_2 = yy( [i_X] , : ) ;
% and take every other value
y_test_2 = y_test_2( (2:2:9000) , : ) ;

train_data = [ X_train y_train ] ;
test_data = [ [ X_test_1 y_test_1 ] ; [ X_test_2 y_test_2 ] ] ;
detect_optima = train_data( (60:60:9000) , : ) ;

endswitch % end of training data indexing switch

% now write this selected period data to -ascii files
save data_for_training -ascii train_data
save data_for_testing -ascii test_data
save detect_optima -ascii detect_optima % for use in Fanntool software

%************************************************************************
% Now the FANN training code !                                          *
%************************************************************************

% First set the parameters for the FANN structure
No_of_input_layer_nodes = 102 
No_of_hidden_layer_nodes = 102 
No_of_output_layer_nodes = 5 
Total_no_of_layers = length( [ No_of_input_layer_nodes No_of_hidden_layer_nodes No_of_output_layer_nodes ] )

% save and write this FANN structure info and length of training data file into an -ascii file - "train_nn_from_this_file"
fid = fopen( 'train_nn_from_this_file' , 'w' ) ;
fprintf( fid , '%i %i %i\n' , size( train_data , 1 ) , No_of_input_layer_nodes , No_of_output_layer_nodes ) ;
fclose(fid) ;

% now create the FANN formatted training file - "train_nn_from_this_file"
system( "perl perl_file_manipulate.pl <data_for_training >>train_nn_from_this_file" ) ;

%{
The above call to "system" interrupts, or pauses, Octave at this point. Now the "shell" or "bash"
takes over and calls a Perl script, "perl_file_manipulate.pl", with the shell redirections
"<data_for_training >>train_nn_from_this_file", where < indicates that the file "data_for_training"
is to be read by the Perl script and >> indicates that the file "train_nn_from_this_file" is to be
appended to by the Perl script. From the fopen and fclose operations above the file to be appended to contains only
FANN structure info, e.g. 9000 102 5 on one line, and the file that is to be read is the training data of NN features 
and outputs extracted by the switch control structure above and written to -ascii files. The contents of the Perl
script file are:

#!/usr/bin/env perl

while (<>) { 
   my @f = split ;
   print("@f[0..$#f-5]\n@f[-5..-1]\n") ;
}

After these Perl operations the file "train_nn_from_this_file" is correctly formatted for the FANN library calls that
are to come
e.g. the file looks like this:-

9000 102 5
-2.50350699e-09 -2.52301858e-09 -2.50273727e-09 -2.44301942e-09 -2.34482961e-09 -2.20974520e-09
0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.00000000e+00
etc.

When all this Perl script stuff is finished control returns to Octave.
%}

%***************************************************************************
% Begin FANN training ! Hurrah !                                           *
%***************************************************************************
% create the FANN
ANN = fann_create( [ No_of_input_layer_nodes No_of_hidden_layer_nodes No_of_output_layer_nodes ] ) ;

% create the parameters for training the FANN in an Octave "struct". All parameters are explicitly stated and set to
% the default values. If not explicitly stated they would be these values anyway, but are explicitly stated just to show
% how this is done
NN_PARAMS = struct( "TrainingAlgorithm", 'rprop', "LearningRate", 0.7, "ActivationHidden", 'Sigmoid', "ActivationOutput", 'Sigmoid',...
"ActivationSteepnessHidden", 0.5, "ActivationSteepnessOutput", 0.5, "TrainErrorFunction", 'TanH', "QuickPropDecay", -0.0001,...
"QuickPropMu", 1.75, "RPropIncreaseFactor", 1.2, "RPropDecreaseFactor", 0.5, "RPropDeltaMin", 0.0, "RPropDeltaMax", 50.0 )

% and then set the parameters
fann_set_parameters( ANN , NN_PARAMS ) ;

% now train the FANN on data contained in file "train_nn_from_this_file"
fann_train( ANN, 'train_nn_from_this_file', 'MaxIterations', 200, 'DesiredError', 0.001, 'IterationsBetweenReports', 10 )

% save the trained FANN in a file e.g. "ann_25.net"
fann_save( ANN , [ "ann_" num2str(period) ".net" ] )

% Now test the ANN on the training data
% create ANN from saved fann_save file
ANN = fann_create( [ "ann_" num2str(period) ".net" ] ) ;

% run the trained ANN on the original feature training set, X_train
X_train_FANN_results = fann_run( ANN , X_train ) ;

% convert the X_train_FANN_results matrix to a single prediction vector
[dummy, prediction] = max( X_train_FANN_results, [], 2 ) ;

% compare accuracy of this NN prediction vector with the known labels in y for this period and display 
[i_X j_X] = find( accurate_period(:,1) == period ) ;
fprintf('\nTraining Set Accuracy: %f\n', mean( double( prediction == y([i_X],:) ) ) * 100 ) ;
fprintf('End of training for ANN period: %f\n', period ) ;

%end % end of period for loop
Typical terminal output during the running of this code looks like this:

octave:143> net_train_octave
Enter period of interest: 25
Max epochs      200. Desired error: 0.0010000000.
Epochs            1. Current error: 0.2537834346. Bit fail 45000.
Epochs           10. Current error: 0.1802092344. Bit fail 20947.
Epochs           20. Current error: 0.0793143436. Bit fail 7380.
Epochs           30. Current error: 0.0403240845. Bit fail 5215.
Epochs           40. Current error: 0.0254898760. Bit fail 2853.
Epochs           50. Current error: 0.0180807728. Bit fail 1611.
Epochs           60. Current error: 0.0150692556. Bit fail 1414.
Epochs           70. Current error: 0.0119200321. Bit fail 1187.
Epochs           80. Current error: 0.0091521516. Bit fail 937.
Epochs           90. Current error: 0.0073408978. Bit fail 670.
Epochs          100. Current error: 0.0060765576. Bit fail 492.
Epochs          110. Current error: 0.0051601632. Bit fail 446.
Epochs          120. Current error: 0.0041675218. Bit fail 386.
Epochs          130. Current error: 0.0036309268. Bit fail 374.
Epochs          140. Current error: 0.0032380833. Bit fail 343.
Epochs          150. Current error: 0.0028855132. Bit fail 302.
Epochs          160. Current error: 0.0025165526. Bit fail 280.
Epochs          170. Current error: 0.0022868335. Bit fail 253.
Epochs          180. Current error: 0.0021089041. Bit fail 220.
Epochs          190. Current error: 0.0019043182. Bit fail 197.
Epochs          200. Current error: 0.0017739790. Bit fail 169.

Training for ANN period: 25.000000
No_of_input_layer_nodes =  102
No_of_hidden_layer_nodes =  102
No_of_output_layer_nodes =  5
Total_no_of_layers =  3
NN_PARAMS =

  scalar structure containing the fields:

    TrainingAlgorithm = rprop
    LearningRate =  0.70000
    ActivationHidden = Sigmoid
    ActivationOutput = Sigmoid
    ActivationSteepnessHidden =  0.50000
    ActivationSteepnessOutput =  0.50000
    TrainErrorFunction = TanH
    QuickPropDecay = -1.0000e-04
    QuickPropMu =  1.7500
    RPropIncreaseFactor =  1.2000
    RPropDecreaseFactor =  0.50000
    RPropDeltaMin = 0
    RPropDeltaMax =  50

Training Set Accuracy: 100.000000
End of training for ANN period: 25.000000
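
The training set accuracy line in the output above depends on two small steps in the script: eye(5)(y,:) turns each integer label into a one-hot row, and max( ... , [], 2 ) turns each row of network output back into a single label. The following is a minimal sketch of that round trip in isolation, so that it runs without FANN installed. The labels and the stand-in for the network output are made up, and y_demo, yy_demo, fake_output and prediction_demo are illustrative names that are not part of the script or of FANN.

```octave
% Hypothetical labels for six samples, each an integer class from 1 to 5.
y_demo = [ 1 ; 3 ; 5 ; 2 ; 2 ; 4 ] ;

% encode: row k of eye(5) is the one-hot vector for class k
yy_demo = eye(5)( y_demo , : ) ;

% made-up stand-in for network output: the targets scaled and shifted so
% that the values are 0.9 and 0.1 rather than exactly 1 and 0
fake_output = 0.8 * yy_demo + 0.1 ;

% decode: the column index of each row's maximum is the predicted class
[ dummy , prediction_demo ] = max( fake_output , [] , 2 ) ;

% same accuracy calculation as used at the end of the script
accuracy_demo = mean( double( prediction_demo == y_demo ) ) * 100 ;
printf( 'Round-trip accuracy: %f\n' , accuracy_demo ) ;
```

Because the largest value in each row sits in the column of the original label, the decoded predictions match y_demo, which is the same comparison the script makes between the FANN output and y.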

The accuracy obtained on all periods from 10 to 50 is at least 98%, with about two thirds being 100%. However, the point of this post is not to show results of any one set of NN features or training parameters, but rather that I can now be more productive by using the speed and flexibility of FANN in the development of my NN market classifier.
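
For readers who would rather not call out to Perl, the formatting step described in the script's block comment could probably be done in Octave alone. What follows is only a sketch under that assumption, using a tiny made-up data set and a hypothetical file name (X_demo, Y_demo and fann_format_demo.txt are not part of the script above). It writes the header line of sample count, input count and output count, then alternating lines of inputs and target outputs, which is the layout shown in the example file in the block comment.

```octave
% Tiny stand-in data: 3 samples, 4 input features, 5 one-hot outputs.
X_demo = [ 0.1 0.2 0.3 0.4 ; 0.5 0.6 0.7 0.8 ; 0.9 1.0 1.1 1.2 ] ;
Y_demo = eye(5)( [ 1 ; 3 ; 5 ] , : ) ;
demo_file = 'fann_format_demo.txt' ; % hypothetical file name

fid = fopen( demo_file , 'w' ) ;
% header line: number of samples, number of inputs, number of outputs
fprintf( fid , '%d %d %d\n' , rows( X_demo ) , columns( X_demo ) , columns( Y_demo ) ) ;
for ii = 1 : rows( X_demo )
  % one line of inputs, then one line of target outputs
  fprintf( fid , '%s\n' , strtrim( sprintf( '%.8e ' , X_demo( ii , : ) ) ) ) ;
  fprintf( fid , '%s\n' , strtrim( sprintf( '%.8e ' , Y_demo( ii , : ) ) ) ) ;
endfor
fclose( fid ) ;

% read the file back to check its shape: 1 header line + 2 lines per sample
file_lines = strsplit( strtrim( fileread( demo_file ) ) , "\n" ) ;
printf( '%s\n' , file_lines{1} ) ;
```

The loop is kept deliberately simple. For a training set of several thousand rows a vectorised write may be faster, though that is untested here.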
