A Data Science Central Community
Hi,
I am trying to perform clustering on my customer files with about 80K customers and 50 variables.
Instead of using either just hierarchical or non-hierarchical methods in SAS, I first tried to determine the "OPTIMAL" number of clusters and their seeds using PROC CLUSTER.
Next, I will feed these seeds into PROC FASTCLUS to refine the clusters. This follows a recommendation someone gave me: use a hierarchical method first to get the seeds, then feed the seeds to a non-hierarchical method to fine-tune the clusters.
However, PROC CLUSTER took forever to even create clusters for my 80K customers. I had to abandon it before it returned any result.
Can anyone suggest a way to deal with a big data set like mine? Thanks.
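For readers following along, the two-step workflow described above (hierarchical clustering to obtain seeds, then PROC FASTCLUS to refine them) might look roughly like the sketch below. It only illustrates the workflow as described and is not a fix for the run-time problem. The data set names (work.customers and the intermediate work.* tables), the variable names x1-x50, Ward's method, and the choice of 5 clusters are all assumptions made for illustration; none of them come from the original post.

```sas
/* Hypothetical input: work.customers with numeric variables x1-x50. */

/* Standardize the inputs once so both steps work on the same scale. */
proc standard data=work.customers mean=0 std=1 out=work.customers_std;
   var x1-x50;
run;

/* Step 1: hierarchical clustering (Ward's method is an assumed choice). */
proc cluster data=work.customers_std method=ward outtree=work.tree noprint;
   var x1-x50;
run;

/* Cut the tree at a chosen number of clusters (5 is a placeholder). */
proc tree data=work.tree nclusters=5 out=work.hier noprint;
   copy x1-x50;
run;

/* Average each hierarchical cluster to obtain one seed per cluster. */
proc means data=work.hier noprint nway;
   class cluster;
   var x1-x50;
   output out=work.seeds(drop=_type_ _freq_ cluster) mean=;
run;

/* Step 2: refine the clusters with PROC FASTCLUS, starting from the seeds. */
proc fastclus data=work.customers_std seed=work.seeds
              maxclusters=5 maxiter=100 out=work.final;
   var x1-x50;
run;
```

The number passed to NCLUSTERS= and MAXCLUSTERS= would in practice come from inspecting the PROC CLUSTER output rather than being fixed in advance.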
Hi Kumud,
I need some clarification. I know that clustering can be used with binary-transformed variables via a distance matrix, but can FASTCLUS be used in the same fashion? Please let me know your thoughts on this.
Thanks,
Deepa
© 2021 TechTarget, Inc.