This chapter discusses one type of unsupervised competitive learning, the Kohonen feature map, or self-organizing map (SOM). As you recall, in unsupervised learning there are no expected outputs presented to a neural network, as in a supervised training algorithm such as backpropagation. Instead, a network, by its self-organizing properties, is able to infer relationships and learn more as more inputs are presented to it. One advantage to this scheme is that you can expect the system to change with changing conditions and inputs. The system constantly learns. The Kohonen SOM is a neural network system developed by Teuvo Kohonen of Helsinki University of Technology and is often used to classify inputs into different categories. Applications for feature maps can be traced to many areas, including speech recognition and robot motor control.
A Kohonen feature map may be used by itself or as a layer of another neural network. A Kohonen layer is composed of neurons that compete with each other. Like in Adaptive Resonance Theory, the Kohonen SOM is another case of using a winner-take-all strategy. Inputs are fed into each of the neurons in the Kohonen layer (from the input layer). Each neuron determines its output according to a weighted sum formula:
Output = Σ wij xi
The weights and the inputs are usually normalized, which means that the magnitude of the weight and input vectors are set equal to one. The neuron with the largest output is the winner. This neuron has a final output of 1. All other neurons in the layer have an output of zero. Differing input patterns end up firing different winner neurons. Similar or identical input patterns classify to the same output neuron. You get like inputs clustered together. In Chapter Application to Pattern Recognition, you will see the use of a Kohonen network in pattern classification.
Consider a vector, A = ax + by + cz. The normalized vector A’ is obtained by dividing each component of A by the square root of the sum of squares of all the components. In other words each component is multiplied by 1/ [radic](a2 + b2 + c2). Both the weight vector and the input vector are normalized during the operation of the Kohonen feature map. The reason for this is the training law uses subtraction of the weight vector from the input vector. Using normalization of the values in the subtraction reduces both vectors to a unit-less status, and hence, makes the subtraction of like quantities possible. You will learn more about the training law shortly.
Lateral inhibition is a process that takes place in some biological neural networks. Lateral connections of neurons in a given layer are formed, and squash distant neighbors. The strength of connections is inversely related to distance. The positive, supportive connections are termed as excitatory while the negative, squashing connections are termed inhibitory.
A biological example of lateral inhibition occurs in the human vision system.
Figure The mexican hat function showing lateral inhibition shows a function, called the mexican hat function, which shows the relationship between the connection strength and the distance from the winning neuron. The effect of this function is to set up a competitive environment for learning. Only winning neurons and their neighbors participate in learning for a given input pattern.
The mexican hat function showing lateral inhibition.
The training law for the Kohonen feature map is straightforward. The change in weight vector for a given output neuron is a gain constant, alpha, multiplied by the difference between the input vector and the old weight vector:
Wnew = Wold + alpha * (Input -Wold)
Both the old weight vector and the input vector are normalized to unit length. Alpha is a gain constant between 0 and 1.
Let us consider the case of a two-dimensional input vector. If you look at a unit circle, as shown in Figure The training law for the Kohonen map as shown on a unit circle, the effect of the training law is to try to align the weight vector and the input vector. Each pattern attempts to nudge the weight vector closer by a fraction determined by alpha. For three dimensions the surface becomes a unit sphere instead of a circle. For higher dimensions you term the surface a hypersphere. It is not necessarily ideal to have perfect alignment of the input and weight vectors. You use neural networks for their ability to recognize patterns, but also to generalize input data sets. By aligning all input vectors to the corresponding winner weight vectors, you are essentially memorizing the input data set classes. It may be more desirable to come close, so that noisy or incomplete inputs may still trigger the correct classification.
The training law for the Kohonen map as shown on a unit circle.
In the Kohonen map, a parameter called the neighborhood size is used to model the effect of the mexican hat function. Those neurons that are within the distance specified by the neighborhood size participate in training and weight vector changes; those that are outside this distance do not participate in learning. The neighborhood size typically is started as an initial value and is decreased as the input pattern cycles continue. This process tends to support the winner-take-all strategy by eventually singling out a winner neuron for a given pattern.
Figure Winner neuron with a neighborhood size of 2 for a Kohonen map shows a linear arrangement of neurons with a neighborhood size of 2. The hashed central neuron is the winner. The darkened adjacent neurons are those that will participate in training.
Winner neuron with a neighborhood size of 2 for a Kohonen map.
Besides the neighborhood size, alpha typically is also reduced during
simulation. You will see these features when we develop a Kohonen
map program.
The C++ code for the Kohonen map draws on much of the code developed for the backpropagation simulator. The Kohonen map is a much simpler program and may not rely on as large a data set for input. The Kohonen map program uses only two files, an input file and an output file. In order to use the program, you must create an input data set and save this in a file called input.dat. The output file is called kohonen.dat and is saved in your current working directory. You will get more details shortly on the formats of these files.
The Kohonen network has two layers, an input layer and a Kohonen output layer. (See Figure A Kohonen network). The input layer is a size determined by the user and must match the size of each row (pattern) in the input data file.
A Kohonen network.
The mexican hat function shows positive values for an immediate neighborhood around the neuron and negative values for distant neurons. A true method of modeling would incorporate mutual excitation or support for neurons that are within the neighborhood (with this excitation increasing for nearer neurons) and inhibition for distant neurons outside the neighborhood. For the sake of computational efficiency, we model lateral inhibition and excitation by looking at the maximum output for the output neurons and making that output belong to a winner neuron. Other outputs are inhibited by setting their outputs to zero. Training, or weight update, is performed on all outputs that are within a neighborhood size distance from the winner neuron. Neurons outside the neighborhood do not participate in training. The true way of modeling lateral inhibition would be too expensive since the number of lateral connections is quite large. You will find that this approximation will lead to a network with many if not all of the properties of a true modeling approach of a Kohonen network.
We use many of the classes from the backpropagation simulator. We require only two layers, the input layer and the Kohonen layer. We make a new layer class called the Kohonen layer class, and a new network class called the Kohonen_network.
The layer class needs to be slightly modified, as shown in Listing Modification of layer.h
// layer.h V.Rao, H. Rao// header file for the layer class hierarchy and// the network class #define MAX_LAYERS 5#define MAX_VECTORS 100 class network;class Kohonen_network;
class layer{ protected: int num_inputs; int num_outputs; float *outputs;// pointer to array of outputs float *inputs; // pointer to array of inputs, which // are outputs of some other layer friend network; friend Kohonen_network; // update for Kohonen model public: virtual void calc_out()=0;};...
Here the changes are indicated in italic. You notice that the Kohonen_network is made a friend to the layer class, so that the Kohonen_network can have access to the data of a layer.
The
next step to take is to create a Kohonen_layer
class and a Kohonen_network class. This
is shown in Listing The Kohonen_layer class and Kohonen_network
class in layerk.h
// layerk.h V.Rao, H. Rao
// header file for the Kohonen layer and// the Kohonen network class Kohonen_network; class Kohonen_layer: public layer {
protected: float * weights; int winner_index; float win_distance; int neighborhood_size; friend Kohonen_network; public: Kohonen_layer(int, int, int); ~Kohonen_layer(); virtual void calc_out(); void randomize_weights(); void update_neigh_size(int); void update_weights(const float); void list_weights(); void list_outputs(); float get_win_dist(); }; class Kohonen_network {
private: layer *layer_ptr[2]; int layer_size[2]; int neighborhood_size; public: Kohonen_network(); ~Kohonen_network(); void get_layer_info(); void set_up_network(int); void randomize_weights(); void update_neigh_size(int); void update_weights(const float); void list_weights(); void list_outputs(); void get_next_vector(FILE *); void process_next_pattern(); float get_win_dist(); int get_win_index(); };
The
Kohonen_layer is derived from the layer
class, so it has pointers inherited that point to a set of outputs and a set of
inputs. Let’s look at some of the functions and member variables.
Kohonen_layer:
• float * weights Pointer to the weights matrix.
• int winner_index Index value of the output, which is the winner.
• float win_distance The Euclidean distance of the winner weight vector from the input vector.
• int neighborhood_size The size of the neighborhood.
• Kohonen_layer(int, int, int) Constructor for the layer: inputs, outputs, and the neighborhood size.
• ~Kohonen_layer() Destructor.
• virtual void calc_out() The function to calculate the outputs; for the Kohonen layer this models lateral competition.
• void randomize_weights() A function to initialize weights with random normal values.
• void update_neigh_size(nt) This function updates the neighborhood size with a new value.
• void update_weights(const float) This function updates the weights according to the training law using the passed parameter, alpha.
• void list_weights() This function can be used to list the weight matrix.
• void list_outputs() This function is used to write outputs to the output file.
• float get_win_dist() Returns the Euclidean distance between the winner weight vector and the input vector.
Kohonen_network:
• layer *layer_ptr[2] Pointer array; element 0 points to the input layer, element 1 points to the Kohonen layer.
• int layer_size[2] Array of layer sizes for
the two layers.
• int neighborhood_size The current neighborhood size.
• Kohonen_network() Constructor.
• ~Kohonen_network() Destructor.
• void get_layer_info() Gets information about the layer sizes.
• void set_up_network(int) Connects layers and sets up the Kohonen map.
• void randomize_weights() Creates random normalized weights.
• void update_neigh_size(int) Changes the neighborhood size.
• void update_weights(const float) Performs weight update according to the training law.
• void list_weights() Can be used to list the weight matrix.
• void list_outputs() Can be used to list outputs.
• void get_next_vector(FILE *) Function gets another input vector from the input file.
• void process_next_pattern() Applies pattern to the Kohonen map.
• float get_win_dist() Returns the winner’s distance from the input vector.
• int get_win_index() Returns the index of the winner.
This listing shows the layerk.cpp file, which has the implementation of the functions outlined.
Implementation
file for the Kohonen layer and Kohonen
network :layerk.cpp
// layerk.cpp V.Rao, H.Rao// compile for floating point hardware if available#include “layer.cpp”#include “layerk.h” // ————————————————————-// Kohonen layer//————————————————————— Kohonen_layer::Kohonen_layer(int i, int o, int init_neigh_size) {
num_inputs=i;
num_outputs=o;
neighborhood_size=init_neigh_size;weights = new float[num_inputs*num_outputs];outputs = new float[num_outputs];} Kohonen_layer::~Kohonen_layer()
{delete [num_outputs*num_inputs] weights;delete [num_outputs] outputs;}void Kohonen_layer::calc_out(){// implement lateral competition// choose the output with the largest// value as the winner; neighboring// outputs participate in next weight// update. Winner’s output is 1 while// all other outputs are zero int i,j,k;
float accumulator=0.0;float maxval;winner_index=0;maxval=-1000000;
for (j=0; j<num_outputs; j++) { for (i=0; i<num_inputs; i++) { k=i*num_outputs; if (weights[k+j]*weights[k+j] > 1000000.0) { cout << “weights are blowing up\n”; cout << “try a smaller learning constant\n”; cout << “e.g. beta=0.02 aborting...\n”; exit(1); } outputs[j]=weights[k+j]*(*(inputs+i)); accumulator+=outputs[j]; } // no squash function outputs[j]=accumulator; if (outputs[j] > maxval) { maxval=outputs[j]; winner_index=j; } accumulator=0; } // set winner output to 1 outputs[winner_index]=1.0; // now zero out all other outputs for (j=0; j< winner_index; j++) outputs[j]=0; for (j=num_outputs-1; j>winner_index; j—) outputs[j]=0; } void Kohonen_layer::randomize_weights(){int i, j, k;
const unsigned first_time=1; const unsigned not_first_time=0;float discard;float norm; discard=randomweight(first_time); for (i=0; i< num_inputs; i++) { k=i*num_outputs; for (j=0; j< num_outputs; j++) { weights[k+j]=randomweight(not_first_time); } } // now need to normalize the weight vectors// to unit length// a weight vector is the set of weights for// a given output for (j=0; j< num_outputs; j++) { norm=0; for (i=0; i< num_inputs; i++) { k=i*num_outputs; norm+=weights[k+j]*weights[k+j];} norm = 1/((float)sqrt((double)norm)); for (i=0; i< num_inputs; i++) { k=i*num_outputs; weights[k+j]*=norm; } } } void Kohonen_layer::update_neigh_size(int new_neigh_size){neighborhood_size=new_neigh_size;} void Kohonen_layer::update_weights(const float alpha){int i, j, k;
int start_index, stop_index;
// learning law: weight_change =// alpha*(input-weight)// zero change if input and weight// vectors are aligned// only update those outputs that// are within a neighborhood’s distance// from the last winnerstart_index = winner_index - neighborhood_size; if (start_index < 0) start_index =0; stop_index = winner_index + neighborhood_size; if (stop_index > num_outputs-1) stop_index = num_outputs-1; for (i=0; i< num_inputs; i++) { k=i*num_outputs; for (j=start_index; j<=stop_index; j++) weights[k+j] += alpha*((*(inputs+i))-weights[k+j]); } } void Kohonen_layer::list_weights(){int i, j, k;
for (i=0; i< num_inputs; i++) { k=i*num_outputs; for (j=0; j< num_outputs; j++) cout << “weight[“<<i<<”,”<< j<<”] is: “<<weights[k+j]; }} void Kohonen_layer::list_outputs(){int i;
for (i=0; i< num_outputs; i++) { cout << “outputs[“<<i<< “] is: “<<outputs[i]; } } float Kohonen_layer::get_win_dist(){int i, j, k;
j=winner_index;float accumulator=0; float * win_dist_vec = new float [num_inputs]; for (i=0; i< num_inputs; i++) { k=i*num_outputs; win_dist_vec[i]=(*(inputs+i))-weights[k+j]; accumulator+=win_dist_vec[i]*win_dist_vec[i]; } win_distance =(float)sqrt((double)accumulator); delete [num_inputs]win_dist_vec; return win_distance; } Kohonen_network::Kohonen_network()
{ } Kohonen_network::~Kohonen_network()
{} void Kohonen_network::get_layer_info(){int i;
//—————————————————————//// Get layer sizes for the Kohonen network//// ————————————————————- cout << “ Enter in the layer sizes separated by spaces.\n”;
cout << “ A Kohonen network has an input layer \n”;
cout << “ followed by a Kohonen (output) layer \n”;
for (i=0; i<2; i++) { cin >> layer_size[i]; } // ———————————————————————————// size of layers:// input_layer layer_size[0]// Kohonen_layer layer_size[1]//———————————————————————————- }void Kohonen_network::set_up_network(int nsz){int i;
// set up neighborhood sizeneighborhood_size = nsz; //———————————————————————————-// Construct the layers//———————————————————————————- layer_ptr[0] = new input_layer(0,layer_size[0]); layer_ptr[1] = new Kohonen_layer(layer_size[0], layer_size[1],neighborhood_size); for (i=0;i<2;i++) { if (layer_ptr[i] == 0) { cout << “insufficient memory\n”; cout << “use a smaller architecture\n”; exit(1); } } //———————————————————————————-// Connect the layers//———————————————————————————-// set inputs to previous layer outputs for the Kohonen layerlayer_ptr[1]->inputs = layer_ptr[0]->outputs; } void Kohonen_network::randomize_weights(){ ((Kohonen_layer *)layer_ptr[1]) ->randomize_weights();} void Kohonen_network::update_neigh_size(int n){((Kohonen_layer *)layer_ptr[1]) ->update_neigh_size(n);} void Kohonen_network::update_weights(const float a){((Kohonen_layer *)layer_ptr[1]) ->update_weights(a);} void Kohonen_network::list_weights(){((Kohonen_layer *)layer_ptr[1]) ->list_weights();} void Kohonen_network::list_outputs(){((Kohonen_layer *)layer_ptr[1]) ->list_outputs();} void Kohonen_network::get_next_vector(FILE * ifile){int i;
float normlength=0;int num_inputs=layer_ptr[1]->num_inputs;
float *in = layer_ptr[1]->inputs;// get a vector and normalize itfor (i=0; i<num_inputs; i++) { fscanf(ifile,”%f”,(in+i)); normlength += (*(in+i))*(*(in+i)); }fscanf(ifile,”\n”);
normlength = 1/(float)sqrt((double)normlength);
for (i=0; i< num_inputs; i++) { (*(in+i)) *= normlength; } } void Kohonen_network::process_next_pattern(){ layer_ptr[1]->calc_out();} float Kohonen_network::get_win_dist(){float retval;retval=((Kohonen_layer *)layer_ptr[1])
->get_win_dist(); return retval;} int Kohonen_network::get_win_index()
{return ((Kohonen_layer *)layer_ptr[1]) ->winner_index;}
The main() function is contained in a file called kohonen.cpp, which is shown in next listing. To compile this program, you need only compile and make this main file, kohonen.cpp. Other files are included in this.
The
main implementation file, kohonen.cpp for the Kohonen Map program
// kohonen.cpp V. Rao, H. Rao// Program to simulate a Kohonen map #include “layerk.cpp” #define INPUT_FILE “input.dat”#define OUTPUT_FILE “kohonen.dat”#define dist_tol 0.05 void main(){ int neighborhood_size, period;
float avg_dist_per_cycle=0.0;float dist_last_cycle=0.0;float avg_dist_per_pattern=100.0; // for the latest cyclefloat dist_last_pattern=0.0;float total_dist;float alpha;unsigned startup;int max_cycles;
int patterns_per_cycle=0;
int total_cycles, total_patterns;
// create a network objectKohonen_network knet;
FILE * input_file_ptr, * output_file_ptr; // open input file for readingif ((input_file_ptr=fopen(INPUT_FILE,”r”))==NULL) { cout << “problem opening input file\n”; exit(1); } // open writing file for writingif ((output_file_ptr=fopen(OUTPUT_FILE,”w”))==NULL) { cout << “problem opening output file\n”; exit(1); } // ————————————————————-// Read in an initial values for alpha, and the// neighborhood size.// Both of these parameters are decreased with// time. The number of cycles to execute before// decreasing the value of these parameters is// called the period. Read in a value for the// period.// ————————————————————- cout << “ Please enter initial values for:\n”; cout << “alpha (0.01-1.0),\n”; cout << “and the neighborhood size (integer between 0 and 50)\n”; cout << “separated by spaces, e.g. 0.3 5 \n “; cin >> alpha >> neighborhood_size ; cout << “\nNow enter the period, which is the\n”; cout << “number of cycles after which the values\n”; cout << “for alpha the neighborhood size are decremented\n”; cout << “choose an integer between 1 and 500 , e.g. 50 \n”; cin >> period; // Read in the maximum number of cycles // each pass through the input data file is a cycle cout << “\nPlease enter the maximum cycles for the simulation\n”; cout << “A cycle is one pass through the data set.\n”; cout << “Try a value of 500 to start with\n\n”; cin >> max_cycles; // the main loop//// continue looping until the average distance is less// than the tolerance specified at the top of this file// , or the maximum number of// cycles is exceeded; // initialize counterstotal_cycles=0; // a cycle is once through all the input datatotal_patterns=0; // a pattern is one entry in the input data // get layer informationknet.get_layer_info();
// set up the network connectionsknet.set_up_network(neighborhood_size);
// initialize the weights // randomize weights for the Kohonen layer// note that the randomize function for the// Kohonen simulator generates// weights that are normalized to length = 1knet.randomize_weights();
// write header to output filefprintf(output_file_ptr, “cycle\tpattern\twin index\tneigh_size\tavg_dist_per_pa tern\n”);
fprintf(output_file_ptr, “———————————————————————————\n”);
// main loop startup=1;total_dist=0; while ( (avg_dist_per_pattern > dist_tol) && (total_cycles < max_cycles) || (startup==1) ){startup=0;dist_last_cycle=0; // reset for each cycle
patterns_per_cycle=0;// process all the vectors in the datafile while (!feof(input_file_ptr)) { knet.get_next_vector(input_file_ptr); // now apply it to the Kohonen network knet.process_next_pattern(); dist_last_pattern=knet.get_win_dist();
// print result to output filefprintf(output_file_ptr,”%i\t%i\t%i\t\t%i\t\t%f\n”,
total_cycles,total_patterns,knet.get_win_index(), neighborhood_size,avg_dist_per_pattern); total_patterns++; // gradually reduce the neighborhood size // and the gain, alpha if (((total_cycles+1) % period) == 0) { if (neighborhood_size > 0) neighborhood_size —; knet.update_neigh_size(neighborhood_size); if (alpha>0.1) alpha -= (float)0.1; } patterns_per_cycle++; dist_last_cycle += dist_last_pattern; knet.update_weights(alpha); dist_last_pattern = 0; } avg_dist_per_pattern= dist_last_cycle/patterns_per_cycle;
total_dist += dist_last_cycle;total_cycles++; fseek(input_file_ptr, 0L, SEEK_SET); // reset the file
pointer // to the beginning of // the file} // end main loop cout << “\n\n\n\n\n\n\n\n\n\n\n”;
cout << “—————————————————————————-\n”;
cout << “ done \n”;
avg_dist_per_cycle= total_dist/total_cycles;
cout << “\n”;
cout << “——>average dist per cycle = “ << avg_dist_per_cycle << “ <—-\n”;
cout << “——>dist last cycle = “ << dist_last_cycle << “ <—- \n”;
cout << “->dist last cycle per pattern= “ << avg_dist_per_pattern << “ <—-\n”;
cout << “——————>total cycles = “ << total_cycles << “ <—-\n”;
cout << “——————>total patterns = “ << total_patterns << “ <—- \n”;
cout << “—————————————————————————-\n”;
// close the input filefclose(input_file_ptr);
}
The flow of the program is very similar to the backpropagation simulator. The criterion for ending the simulation in the Kohonen program is the average winner distance. This is a Euclidean distance measure between the input vector and the winner’s weight vector. This distance is the square root of the sum of the squares of the differences between individual vector components between the two vectors.
Once you compile the program, you need to create an input file to try it. We will first use a very simple input file and examine the results.
Let us create an input file, input.dat, which contains only two arbitrary vectors:
|
0.4 0.98 0.1 0.2 |
|
0.5 0.22 0.8 0.9 |
The file contains two four-dimensional vectors. We
expect to see output that contains a different winner neuron for each of these
patterns. If this is the case, then the Kohonen map
has assigned different categories for each of the input vectors, and, in the
future, you can expect to get the same winner classification for vectors that
are close to or equal to these vectors.
By
running the Kohonen map program, you will see the
following output (user input is italic):
Please enter initial values for:alpha (0.01-1.0),and the neighborhood size (integer between 0 and 50)separated by spaces, e.g. 0.3 50.3 5
Now enter the period, which is thenumber of cycles after which the valuesfor alpha the neighborhood size are decrementedchoose an integer between 1 and 500 , e.g. 5050
Please enter the maximum cycles for the simulationA cycle is one pass through the data set.Try a value of 500 to start with500
Enter in the layer sizes separated by spaces. A Kohonen network has an input layer followed by a Kohonen (output) layer4 10
—————————————————————————— done——>average dist per cycle = 0.544275 <——-——>dist last cycle = 0.0827523 <——-->dist last cycle per pattern= 0.0413762 <——-——————>total cycles = 11 <——-——————>total patterns = 22 <——-——————————————————————————
The
layer sizes are given as 4 for the input layer and 10 for the Kohonen layer. You should choose the size of the Kohonen layer to be larger than the number of distinct
patterns that you think are in the input data set. One of the outputs reported
on the screen is the distance for the last cycle per pattern. This value is
listed as 0.04, which is less than the terminating value set at the top of the kohonen.cpp file of 0.05. The map converged on a solution.
Let us look at the file, kohonen.dat, the output
file, to see the mapping to winner indexes:
cycle pattern win index neigh_size avg_dist_per_pattern——————————————————————————————————————————————————————————————————————0 0 1 5 100.0000000 1 3 5 100.0000001 2 1 5 0.3042851 3 3 5 0.3042852 4 1 5 0.5682552 5 3 5 0.5682553 6 1 5 0.5427933 7 8 5 0.5427934 8 1 5 0.5024164 9 8 5 0.5024165 10 1 5 0.3516925 11 8 5 0.3516926 12 1 5 0.2461846 13 8 5 0.2461847 14 1 5 0.1723297 15 8 5 0.1723298 16 1 5 0.1206308 17 8 5 0.1206309 18 1 5 0.0844419 19 8 5 0.08444110 20 1 5 0.05910910 21 8 5 0.059109
In this example, the neighborhood size stays at its initial value of 5. In the first column you see the cycle number, and in the second the pattern number. Since there are two patterns per cycle, you see the cycle number repeated twice for each cycle.
The Kohonen map was able to
find two distinct winner neurons for each of the patterns. One has winner index
1 and the other index 8.
For
a second example, look at Figure Orthogonal
input vectors, where we choose input vectors on a two-dimensional unit
circle that are 90° apart. The input.dat file
should look like the following:
1 0 0 1-1 0 0 -1
Orthogonal input vectors.
Using
the same parameters for the Kohonen network, but with
layer sizes of 2 and 10, what result would you expect? The output file, kohonen.dat, follows:
cycle pattern win index neigh_size avg_dist_per_pattern———————————————————————————————————————————————————————————————————————0 0 4 5 100.0000000 1 0 5 100.0000000 2 9 5 100.0000000 3 3 5 100.0000001 4 4 5 0.4445581 5 0 5 0.444558 497 1991 6 0 0.707107498 1992 0 0 0.707107498 1993 0 0 0.707107498 1994 6 0 0.707107498 1995 6 0 0.707107499 1996 0 0 0.707107499 1997 0 0 0.707107499 1998 6 0 0.707107499 1999 6 0 0.707107
You can see that this example doesn’t quite work. Even though the neighborhood size gradually got reduced to zero, the four inputs did not get categorized to different outputs. The winner distance became stuck at the value of 0.707, which is the distance from a vector at 45°. In other words, the map generalizes a little too much, arriving at the middle value for all of the input vectors.
You
can fix this problem by starting with a smaller neighborhood size, which
provides for less generalization. By using the same parameters and a
neighborhood size of 2, the following output is obtained.
cycle pattern win index neigh_size avg_dist_per_pattern—————————————————————————————————————————————————————————————————————0 0 5 2 100.0000000 1 6 2 100.0000000 2 4 2 100.0000000 3 9 2 100.0000001 4 0 2 0.4316951 5 6 2 0.4316951 6 3 2 0.4316951 7 9 2 0.4316952 8 0 2 0.5047282 9 6 2 0.5047282 10 3 2 0.5047282 11 9 2 0.5047283 12 0 2 0.3533093 13 6 2 0.3533093 14 3 2 0.3533093 15 9 2 0.3533094 16 0 2 0.2473174 17 6 2 0.2473174 18 3 2 0.2473174 19 9 2 0.2473175 20 0 2 0.1731225 21 6 2 0.1731225 22 3 2 0.1731225 23 9 2 0.1731226 24 0 2 0.1211856 25 6 2 0.1211856 26 3 2 0.1211856 27 9 2 0.1211857 28 0 2 0.0848307 29 6 2 0.0848307 30 3 2 0.0848307 31 9 2 0.0848308 32 0 2 0.0593818 33 6 2 0.0593818 34 3 2 0.0593818 35 9 2 0.059381
For
this case, the network quickly converges on a unique winner for each of the
four input patterns, and the distance criterion is below the set criterion
within eight cycles. You can experiment with other input data sets and
combinations of Kohonen network parameters.
There are many variations of the Kohonen network. Some of these will be briefly discussed in this section.
DeSieno has used a conscience factor in a Kohonen network. For a winning neuron, if the neuron is winning more than a fair share of the time (roughly more than 1/n, where n is the number of neurons), then this neuron has a threshold that is applied temporarily to allow other neurons the chance to win. The purpose of this modification is to allow more uniform weight distribution while learning is taking place.
You have read about LVQ (Learning Vector Quantizer) in previous chapters. In light of the Kohonen map, it should be pointed out that the LVQ is simply a supervised version of the Kohonen network. Inputs and expected output categories are presented to the network for training. You get data clustered, just as a Kohonen network, according to the similarity to other data inputs.
A neural network topology, called a counterpropagation network, is a combination of a Kohonen layer with a Grossberg layer. This network was developed by Robert Hecht-Nielsen and is useful for prototyping of systems, with a fairly rapid training time compared to backpropagation. The Kohonen layer provides for categorization, while the Grossberg layer allows for Hebbian conditioned learning. Counterpropagation has been used successfully in data compression applications for images. Compression ratios of 10:1 to 100:1 have been obtained, using a lossy compression scheme that codes the image with a technique called vector quantization, where the image is broken up into representative subimage vectors. The statistics of these vectors is such that you find that a large part of the image can be adequately represented by a subset of all the vectors. The vectors with the highest frequency of occurrence are coded with the shortest bit strings, hence you achieve data compression.
Kohonen created a phonetic typewriter by classifying speech waveforms of different phonemes of Finnish speech into different categories using a Kohonen SOM. The Kohonen phoneme map used 50 samples of each phoneme for calibration. These samples caused excitation in a neighborhood of cells more strongly than in other cells. A neighborhood was labeled with the particular phoneme that caused excitation. For an utterance of speech made to the network, the exact neighborhoods that were active during the utterance were noted, and for how long, and in what sequence. Short excitations were taken as transitory sounds. The information obtained from the network was then pieced together to find out the words in the utterance made to the network.
In this chapter, you have learned about one of the important types of competitive learning called Kohonen feature map. The most significant points of this discussion are outlined as follows:
• The Kohonen feature map is an example of an unsupervised neural network that is mainly used as a classifier system or data clustering system. As more inputs are presented to this network, the network improves its learning and is able to adapt to changing inputs.
• The training law for the Kohonen network tries to align the weight vectors along the same direction as input vectors.
• The Kohonen network models lateral competition as a form of self-organization. One winner neuron is derived for each input pattern to categorize that input.
• Only neurons within a certain distance (neighborhood) from the winner are allowed to participate in training for a given input pattern.