I was wondering how FASTCLUS arrives at the optimum number of clusters, and how does it choose the initial seeds?


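If you are referring to SAS PROC FASTCLUS, it does not arrive at an optimum number of clusters. You have to determine the appropriate number yourself, often through trial and error: run the procedure with several values of MAXCLUSTERS= and compare the statistics it reports, such as the pseudo F statistic and the cubic clustering criterion. As for the initial seeds, by default they are not random: the procedure takes the first complete observation as the first seed and then picks later observations that are far enough (per the RADIUS= option) from the seeds already chosen, so the result can change when the order of the observations changes. If you want randomly chosen seeds, specify REPLACE=RANDOM, optionally with RANDOM= to set the random number seed; different RANDOM= values can then give a different model on each run. You can save the cluster seeds from a run to a data set with OUTSEED= to re-use if you happen upon a very good model, or you can provide your own seed data set with SEED= if you have used some other method to generate seeds.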
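
As a rough sketch of saving and re-using seeds, something like the following should work. The data set names (work.customers, work.good_seeds and so on), the variables x1-x3, the choice of 4 clusters, and the RANDOM= value are all made up for illustration, so substitute your own. Note that OUTSEED= writes the final cluster seeds (the cluster means) of the run, not the starting seeds.

```sas
/* Hypothetical input: WORK.CUSTOMERS with numeric variables x1-x3. */

/* Run 1: cluster with pseudo-random initial seeds and save the     */
/* final cluster seeds (cluster means) to WORK.GOOD_SEEDS.          */
proc fastclus data=work.customers maxclusters=4 maxiter=50
              replace=random random=12345
              outseed=work.good_seeds out=work.clusters;
   var x1 x2 x3;
run;

/* Run 2: start from the saved seeds instead of letting the         */
/* procedure pick its own. MAXITER=0 only assigns observations to   */
/* the nearest saved seed without moving the seeds.                 */
proc fastclus data=work.customers seed=work.good_seeds
              maxclusters=4 maxiter=0
              out=work.clusters_rerun;
   var x1 x2 x3;
run;
```

To compare different numbers of clusters, you could repeat the first step with other MAXCLUSTERS= values and look at the fit statistics printed for each run.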


