Hi guys,

I want to perform clustering on categorical variables (from survey data, with around 2000 observations).

I guess PROC CLUSTER is appropriate for that (if we create dummy variables).

Can you please confirm this and suggest how to go about using it?

Which method should I use? (Ward's?)

And how do we reduce the number of variables in this process?

How do we calculate distance between the clusters in this case? 

How do we score a new dataset?

Thanks in advance.

Replies to This Discussion

With categorical variables, it would be better to use some form of latent class analysis (LCA) than to cluster dummy variables. PROC LCA is available as an add-on procedure for SAS, and there are several standalone packages such as Latent GOLD.
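To make the latent class idea a bit more concrete, here is a rough sketch of what such a model does, written in plain Python rather than SAS. It is not what PROC LCA or Latent GOLD run internally; it is a bare-bones EM fit under the usual assumption that the variables are independent within each class. The function names (`fit_lca`, `posterior`, `m_step`), the smoothing constant `alpha`, and the toy survey answers are all made up for illustration.

```python
import math
import random


def m_step(rows, resp, levels, alpha):
    """Re-estimate class sizes and per-class answer probabilities."""
    n_classes = len(resp[0])
    weights, probs = [], []
    for k in range(n_classes):
        mass = sum(r[k] for r in resp)  # soft count of rows in class k
        weights.append((mass + alpha) / (len(rows) + alpha * n_classes))
        class_probs = []
        for j, lev in enumerate(levels):
            counts = {v: alpha for v in lev}  # alpha keeps every probability > 0
            for row, r in zip(rows, resp):
                counts[row[j]] += r[k]
            total = mass + alpha * len(lev)
            class_probs.append({v: c / total for v, c in counts.items()})
        probs.append(class_probs)
    return weights, probs


def posterior(row, weights, probs):
    """Class membership probabilities for one observation (old or new)."""
    logs = []
    for k, w in enumerate(weights):
        lp = math.log(w)
        for j, value in enumerate(row):
            # Levels never seen in training get the same tiny probability
            # in every class, so they do not favour any class.
            lp += math.log(probs[k][j].get(value, 1e-12))
        logs.append(lp)
    top = max(logs)
    unnorm = [math.exp(lp - top) for lp in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def fit_lca(rows, n_classes, n_iter=200, seed=0, alpha=1e-3):
    """Fit the model by EM, starting from random soft assignments."""
    rng = random.Random(seed)
    levels = [sorted({row[j] for row in rows}) for j in range(len(rows[0]))]
    resp = []
    for _ in rows:
        w = [rng.random() + 0.1 for _ in range(n_classes)]
        resp.append([x / sum(w) for x in w])
    for _ in range(n_iter):
        weights, probs = m_step(rows, resp, levels, alpha)         # M-step
        resp = [posterior(row, weights, probs) for row in rows]    # E-step
    return weights, probs


# Toy survey data (invented): two obvious groups of respondents.
survey = [("yes", "high", "a")] * 20 + [("no", "low", "b")] * 20
weights, probs = fit_lca(survey, n_classes=2)

# Scoring a new respondent is just the posterior under the fitted model.
new_scores = posterior(("yes", "high", "b"), weights, probs)
new_class = new_scores.index(max(new_scores))
```

Note that this needs no dummy variables and no distance measure: each respondent gets a probability of belonging to each class, and a new dataset is scored by calling `posterior` with the fitted parameters. In practice you would also compare several values of `n_classes` (for example by BIC) and run several random starts, which this sketch leaves out.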

© 2019 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC
