Dekalog Blog: Successful Integration of FANN

Wednesday 5 September 2012

Successful Integration of FANN

I am pleased to say that all my recent work seems to have borne fruit, and I have now managed to code up a training and testing routine in Octave that uses the FANN library and its Octave bindings. I think that this has been some of my most challenging coding work up to now, and required many hours of research on the web and forum help to complete.

I find that one frustration with using open source software is the sparse and sometimes non-existent documentation and this blog post is partly intended as a guide for those readers who may also wish to use FANN in Octave. The code in the code box below is roughly divided into these sections

Octave code to index into and extract the relevant data from previously saved files
a section that uses Perl to format this data
the Octave binding code that actually implements the FANN library functions to set up and train a NN
a short bit of code to save and then test the NN on the training data

As the code itself is heavily commented no further comment is required.

% load training_data_1.mat on command line before running this script.

clear exclusive -X -accurate_period -y

yy = eye(5)(y,:) ; % using training labels y, create an output vector suitable for NN training

period = input('Enter period of interest: ') ;

%for period = 10:50

fprintf('\nTraining for ANN period: %f\n', period ) ;

% This first switch control block creates the training data by indexing, by period, into them
% data loaded from training_data_1.mat
switch (period)

case 10

% index using input period
[i_X j_X] = find( accurate_period(:,1) == period ) ;
% extract the relevant part of X using above i_X index
X_train = X( [i_X] , : ) ;
% and same for market labels vector y
y_train = yy( [i_X] , : ) ;

% now index using input period plus 1 for test set
[i_X j_X] = find( accurate_period(:,1) == period+1 ) ;
% extract the relevant part of X using above i_X index
X_test = X( [i_X] , : ) ;
y_test = yy( [i_X] , : ) ;

train_data = [ X_train y_train ] ;
test_data = [ X_test y_test ] ;
detect_optima = train_data( (60:60:9000) , : ) ;

case 50

% index using input period
[i_X j_X] = find( accurate_period(:,1) == period ) ;
% extract the relevant part of X using above i_X index
X_train = X( [i_X] , : ) ;
% and same for market labels vector y
y_train = yy( [i_X] , : ) ;

% now index using input period minus 1 for test set
[i_X j_X] = find( accurate_period(:,1) == period-1 ) ;
% extract the relevant part of X using above i_X index
X_test = X( [i_X] , : ) ;
y_test = yy( [i_X] , : ) ;

train_data = [ X_train y_train ] ;
test_data = [ X_test y_test ] ;
detect_optima = train_data( (60:60:9000) , : ) ;

otherwise

% index using input period
[i_X j_X] = find( accurate_period(:,1) == period ) ;
% extract the relevant part of X using above i_X index
X_train = X( [i_X] , : ) ;
% and same for market labels vector y
y_train = yy( [i_X] , : ) ;

% now index using input period minus 1 for test set
[i_X j_X] = find( accurate_period(:,1) == period-1 ) ;
% extract the relevant part of X using above i_X index
X_test_1 = X( [i_X] , : ) ;
% and take every other value
X_test_1 = X_test_1( (2:2:9000) , : ) ;
% and same for market labels vector y
y_test_1 = yy( [i_X] , : ) ;
% and take every other value
y_test_1 = y_test_1( (2:2:9000) , : ) ;

% now index using input period plus 1 for test set
[i_X j_X] = find( accurate_period(:,1) == period+1 ) ;
% extract the relevant part of X using above i_X index
X_test_2 = X( [i_X] , : ) ;
% and take every other value
X_test_2 = X_test_2( (2:2:9000) , : ) ;
% and same for market labels vector y
y_test_2 = yy( [i_X] , : ) ;
% and take every other value
y_test_2 = y_test_2( (2:2:9000) , : ) ;

train_data = [ X_train y_train ] ;
test_data = [ [ X_test_1 y_test_1 ] ; [ X_test_2 y_test_2 ] ] ;
detect_optima = train_data( (60:60:9000) , : ) ;

endswitch % end of training data indexing switch

% now write this selected period data to -ascii files
save data_for_training -ascii train_data
save data_for_testing -ascii test_data
save detect_optima -ascii detect_optima % for use in Fanntool software

%************************************************************************
% Now the FANN training code !                                          *
%************************************************************************

% First set the parameters for the FANN structure
No_of_input_layer_nodes = 102 
No_of_hidden_layer_nodes = 102 
No_of_output_layer_nodes = 5 
Total_no_of_layers = length( [ No_of_input_layer_nodes No_of_hidden_layer_nodes No_of_output_layer_nodes ] )

% save and write this FANN structure info and length of training data file into an -ascii file - "train_nn_from_this_file"
fid = fopen( 'train_nn_from_this_file' , 'w' ) ;
fprintf( fid , ' %i %i %i\n ' , length(train_data) , No_of_input_layer_nodes , No_of_output_layer_nodes ) ;
fclose(fid) ;

% now create the FANN formatted training file - "train_nn_from_this_file"
system( "perl perl_file_manipulate.pl >train_nn_from_this_file" ) ;

%{
The above call to "system" interupts, or pauses, Octave at this point. Now the "shell" or "bash"
takes over and calls a Perl script, "perl_file_manipulate.pl", with the command line arguments
">train_nn_from_this_file", where < indicates that the file "data_for_training"
is to be read by the Perl script and >> indicates that the file "train_nn_from_this_file" is to be 
appended by the Perl script. From the fopen and fclose operations above the file to be appended contains only 
FANN structure info, e.g. 9000 102 5 on one line, and the file that is to be read is the training data of NN features 
and outputs extracted by the switch control structure above and written to -ascii files. The contents of the Perl
script file are:

#!/usr/bin/env perl

while (<>) { 
   my @f = split ;
   print("@f[0..$#f-5]\n@f[-5..-1]\n") ;
}

After these Perl operations the file "train_nn_from_this_file" is correctly formatted for the FANN library calls that
are to come
e.g. the file looks like this:-

9000 102 5
-2.50350699e-09 -2.52301858e-09 -2.50273727e-09 -2.44301942e-09 -2.34482961e-09 -2.20974520e-09
0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 1.00000000e+00
etc.

When all this Perl script stuff is finished control returns to Octave.
%}

%***************************************************************************
% Begin FANN training ! Hurrah !                                           *
%***************************************************************************
% create the FANN
ANN = fann_create( [ No_of_input_layer_nodes No_of_hidden_layer_nodes No_of_output_layer_nodes ] ) ;

% create the parameters for training the FANN in an Octave "struct." All parameters are explicitly stated and set to the
% the default values. If not explicitly stated they would be these values anyway, but are explicitly stated just to show 
% how this is done 
NN_PARAMS = struct( "TrainingAlgorithm", 'rprop', "LearningRate", 0.7, "ActivationHidden", 'Sigmoid', "ActivationOutput", 'Sigmoid',...
"ActivationSteepnessHidden", 0.5, "ActivationSteepnessOutput", 0.5, "TrainErrorFunction", 'TanH', "QuickPropDecay", -0.0001,...
"QuickPropMu", 1.75, "RPropIncreaseFactor", 1.2, "RPropDecreaseFactor", 0.5, "RPropDeltaMin", 0.0, "RPropDeltaMax", 50.0 )

% and then set the parameters
fann_set_parameters( ANN , NN_PARAMS ) ;

% now train the FANN on data contained in file "train_nn_from_this_file"
fann_train( ANN, 'train_nn_from_this_file', 'MaxIterations', 200, 'DesiredError', 0.001, 'IterationsBetweenReports', 10 )

% save the trained FANN in a file e.g. "ann_25.net"
fann_save( ANN , [ "ann_" num2str(period) ".net" ] )

% Now test the ANN on the test_data set
% create ANN from saved fann_save file
ANN = fann_create( [ "ann_" num2str(period) ".net" ] ) ;

% run the trained ANN on the original feature training set, X_train
X_train_FANN_results = fann_run( ANN , X_train ) ;

% convert the X_train_FANN_results matrix to a single prediction vector
[dummy, prediction] = max( X_train_FANN_results, [], 2 ) ;

% compare accuracy of this NN prediction vector with the known labels in y for this period and display 
[i_X j_X] = find( accurate_period(:,1) == period ) ;
fprintf('\nTraining Set Accuracy: %f\n', mean( double( prediction == y([i_X],:) ) ) * 100 ) ;
fprintf('End of training for ANN period: %f\n', period ) ;

%end % end of period for loop

Typical terminal output during the running of this code looks like this:

octave:143> net_train_octave
Enter period of interest: 25
Max epochs      200. Desired error: 0.0010000000.
Epochs            1. Current error: 0.2537834346. Bit fail 45000.
Epochs           10. Current error: 0.1802092344. Bit fail 20947.
Epochs           20. Current error: 0.0793143436. Bit fail 7380.
Epochs           30. Current error: 0.0403240845. Bit fail 5215.
Epochs           40. Current error: 0.0254898760. Bit fail 2853.
Epochs           50. Current error: 0.0180807728. Bit fail 1611.
Epochs           60. Current error: 0.0150692556. Bit fail 1414.
Epochs           70. Current error: 0.0119200321. Bit fail 1187.
Epochs           80. Current error: 0.0091521516. Bit fail 937.
Epochs           90. Current error: 0.0073408978. Bit fail 670.
Epochs          100. Current error: 0.0060765576. Bit fail 492.
Epochs          110. Current error: 0.0051601632. Bit fail 446.
Epochs          120. Current error: 0.0041675218. Bit fail 386.
Epochs          130. Current error: 0.0036309268. Bit fail 374.
Epochs          140. Current error: 0.0032380833. Bit fail 343.
Epochs          150. Current error: 0.0028855132. Bit fail 302.
Epochs          160. Current error: 0.0025165526. Bit fail 280.
Epochs          170. Current error: 0.0022868335. Bit fail 253.
Epochs          180. Current error: 0.0021089041. Bit fail 220.
Epochs          190. Current error: 0.0019043182. Bit fail 197.
Epochs          200. Current error: 0.0017739790. Bit fail 169.

Training for ANN period: 25.000000
No_of_input_layer_nodes = 102
No_of_hidden_layer_nodes = 102
No_of_output_layer_nodes = 5
Total_no_of_layers = 3
NN_PARAMS =

scalar structure containing the fields:

    TrainingAlgorithm = rprop
    LearningRate = 0.70000
    ActivationHidden = Sigmoid
    ActivationOutput = Sigmoid
    ActivationSteepnessHidden = 0.50000
    ActivationSteepnessOutput = 0.50000
    TrainErrorFunction = TanH
    QuickPropDecay = -1.0000e-04
    QuickPropMu = 1.7500
    RPropIncreaseFactor = 1.2000
    RPropDecreaseFactor = 0.50000
    RPropDeltaMin = 0
    RPropDeltaMax = 50

Training Set Accuracy: 100.000000
End of training for ANN period: 25.000000

The accuracy obtained on all periods from 10 to 50 is at least 98%, with about two thirds being 100%. However, the point of this post is not to show results of any one set of NN features or training parameters, but rather that I can now be more productive by using the speed and flexibility of FANN in the development of my NN market classifier.

Dekalog Blog

Pages

Wednesday 5 September 2012

Successful Integration of FANN

No comments: