K-Means Clustering Code
I've been having a hard time training a Self-Organizing Map to categorize a large pool of short documents by my selected keywords. The initial results were promising, but even after several days of training I couldn't tune the training parameters well enough to bring the error low enough that I would trust the categorizations.
Some nice folks on comp.ai.neural-nets suggested a few other techniques, and I've implemented simple routines to perform K-Means Clustering. Categorizing my 3300 documents by 323 keywords now takes less than 10 seconds.
Download my k-means C source
(This source uses raw float arrays and includes a function to categorize the vectors in a FANN training data struct.)
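For anyone who hasn't seen the algorithm, here is a minimal sketch of K-Means on raw float arrays. It is not the downloadable source: the function name `kmeans`, its parameters, the row-major layout, and the choice to seed the centroids with the first k vectors are all assumptions made for illustration, and it doesn't touch FANN at all.

```c
#include <assert.h>
#include <float.h>
#include <stdlib.h>
#include <string.h>

/* Squared Euclidean distance between two vectors of length dim. */
static float dist_sq(const float *a, const float *b, int dim)
{
    float sum = 0.0f;
    for (int i = 0; i < dim; i++) {
        float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

/*
 * Hypothetical interface (an illustration, not the API of the real source).
 *   data:      n * dim floats, row-major, one row per document vector
 *   centroids: k * dim floats, filled in by this function
 *   labels:    n ints, receives the cluster index of each vector
 * Returns the number of iterations run, or -1 if allocation fails.
 */
int kmeans(const float *data, int n, int dim, int k,
           float *centroids, int *labels, int max_iter)
{
    assert(k > 0 && n >= k && dim > 0);

    float *sums = malloc((size_t)k * dim * sizeof *sums);
    int *counts = malloc((size_t)k * sizeof *counts);
    if (!sums || !counts) {
        free(sums);
        free(counts);
        return -1;
    }

    /* Assumption: seed the centroids with the first k vectors. */
    memcpy(centroids, data, (size_t)k * dim * sizeof *centroids);
    for (int i = 0; i < n; i++)
        labels[i] = -1;

    int iter = 0;
    while (iter < max_iter) {
        int changed = 0;
        iter++;

        /* Assignment step: attach each vector to its nearest centroid. */
        for (int i = 0; i < n; i++) {
            int best = 0;
            float best_d = FLT_MAX;
            for (int c = 0; c < k; c++) {
                float d = dist_sq(data + (size_t)i * dim,
                                  centroids + (size_t)c * dim, dim);
                if (d < best_d) {
                    best_d = d;
                    best = c;
                }
            }
            if (labels[i] != best) {
                labels[i] = best;
                changed = 1;
            }
        }
        if (!changed)
            break; /* converged: no vector switched cluster */

        /* Update step: move each centroid to the mean of its members. */
        memset(sums, 0, (size_t)k * dim * sizeof *sums);
        memset(counts, 0, (size_t)k * sizeof *counts);
        for (int i = 0; i < n; i++) {
            counts[labels[i]]++;
            for (int j = 0; j < dim; j++)
                sums[(size_t)labels[i] * dim + j] += data[(size_t)i * dim + j];
        }
        for (int c = 0; c < k; c++) {
            if (counts[c] == 0)
                continue; /* empty cluster: keep its previous centroid */
            for (int j = 0; j < dim; j++)
                centroids[(size_t)c * dim + j] =
                    sums[(size_t)c * dim + j] / (float)counts[c];
        }
    }

    free(sums);
    free(counts);
    return iter;
}
```

With documents as rows and keywords as columns, a call for my data set would pass n = 3300 and dim = 323. Seeding from the first k rows keeps the sketch deterministic, but a random or spread-out seeding would probably give better clusters on real data.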