Please select the variables that you feel are important for the cluster analysis. Once you have decided on the number of clusters, merge the cluster results with the original data set and profile the remaining, unused variables. Yes, you can find good articles on latent class analysis using SAS EG; a Google search will turn them up. If you are planning to use R, there are a lot of packages available for latent class analysis.
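To make the "merge and profile" step concrete, here is a minimal sketch in plain Python rather than SAS. The records, the variable names (`income`, `age`, `tenure`) and the cluster assignments are all made up for illustration; in practice the labels would come from whatever clustering step you ran.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical data: "income" and "age" were used for clustering,
# "tenure" was held back and is only used for profiling.
records = [
    {"id": 1, "income": 20, "age": 25, "tenure": 2},
    {"id": 2, "income": 22, "age": 27, "tenure": 4},
    {"id": 3, "income": 90, "age": 55, "tenure": 10},
    {"id": 4, "income": 95, "age": 60, "tenure": 14},
]

# Hypothetical cluster assignments keyed by id (output of any clustering step).
cluster_of = {1: 0, 2: 0, 3: 1, 4: 1}


def profile(records, cluster_of, unused_vars):
    """Merge cluster labels onto the records, then average each unused variable per cluster."""
    grouped = defaultdict(list)
    for row in records:
        merged = dict(row, cluster=cluster_of[row["id"]])  # the merge step
        grouped[merged["cluster"]].append(merged)
    return {
        cluster: {var: mean(r[var] for r in rows) for var in unused_vars}
        for cluster, rows in grouped.items()
    }


profiles = profile(records, cluster_of, ["tenure"])
print(profiles)
```

Comparing the per-cluster averages of the held-back variables is one simple way to check whether the clusters differ on something other than the variables that defined them.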
That was quite informative.
Are you suggesting that I use these dummies, or that I use these WOE variables, together with the other continuous variables in the equation?
I will be using PROC FASTCLUS in SAS, so should I put all these variables and dummies together in the VAR statement?
I had a similar kind of data in one of my clustering projects. I used expectation-maximization (EM) clustering, a likelihood-based algorithm that works with prior probability distributions over the clusters. In fact, you can also do latent class analysis on this kind of mixed-type data. I would recommend the EM algorithm for the clustering.
I do not know whether this is possible with SAS EG; I used the open-source tool Weka to do this analysis.
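For anyone who has not seen EM clustering before, here is a rough sketch of the idea in plain Python. It is not what Weka runs internally and it does not handle mixed data: it assumes a single continuous variable and exactly two Gaussian clusters, and the data values, the starting values and the function names are invented for illustration.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def em_two_gaussians(data, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture with EM and return hard labels."""
    # Crude initialisation (an assumption): means at the min and max of the data.
    mu = [min(data), max(data)]
    sigma = [1.0, 1.0]
    weight = [0.5, 0.5]  # prior probability of each cluster
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each cluster for each point.
        resp = []
        for x in data:
            p = [weight[k] * normal_pdf(x, mu[k], sigma[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate priors, means and spreads from the weighted points.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weight[k] = nk / len(data)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk
            sigma[k] = max(math.sqrt(var), 1e-3)  # floor keeps a cluster from collapsing
    # Assign each point to the cluster with the higher posterior probability.
    labels = [0 if r[0] >= r[1] else 1 for r in resp]
    return mu, sigma, weight, labels


# Hypothetical data with two obvious groups.
data = [1.0, 1.2, 0.8, 1.1, 5.0, 5.3, 4.8, 5.1]
mu, sigma, weight, labels = em_two_gaussians(data)
print(mu, weight, labels)
```

The point of the sketch is the alternation: the E-step gives each observation a probability of belonging to each cluster, and the M-step updates the cluster parameters from those probabilities. A version for mixed data would swap in a suitable distribution for each categorical variable.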
I hope this helps.
I wouldn't recommend recoding categorical variables into numerics. I would stick with decision trees, correspondence analysis, or latent class analysis. You cannot do latent class analysis in SAS using EG, but there is PROC LCA (an add-on procedure rather than part of base SAS), which will do the trick.
The main reason is that nominal categorical variables have no order, and for the others (ordinal ones) the numeric values you assign are arbitrary. The dummy variable technique is fine for regression, where the effects are additive, but I am not sure how I would interpret dummies in a cluster analysis when a variable has multiple levels. Adding a single binary variable might be OK.
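A small illustration of the ordering problem, as a sketch in plain Python. The variable, its levels (`red`, `green`, `blue`) and the integer codes are hypothetical. It compares the Euclidean distances a distance-based clustering method would see under an arbitrary integer recoding and under dummy (one-hot) coding.

```python
import math


def euclid(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical nominal variable with three unordered levels.
levels = ["red", "green", "blue"]

# Arbitrary integer recoding: red=1, green=2, blue=3 (implies an order that is not there).
int_code = {lvl: [float(i)] for i, lvl in enumerate(levels, start=1)}

# Dummy (one-hot) coding: one 0/1 column per level.
one_hot = {
    lvl: [1.0 if j == i else 0.0 for j in range(len(levels))]
    for i, lvl in enumerate(levels)
}


def pairwise(coding):
    """Distance between every pair of levels under the given coding."""
    return {
        (a, b): euclid(coding[a], coding[b])
        for i, a in enumerate(levels)
        for b in levels[i + 1:]
    }


int_dist = pairwise(int_code)
hot_dist = pairwise(one_hot)
print(int_dist)
print(hot_dist)
```

Under the integer codes, red ends up twice as far from blue as from green purely because of the numbers chosen. Under dummy coding all pairs are equally far apart, which removes the artificial order but does not by itself settle how to interpret the resulting clusters.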
I haven't tried PROC LCA in SAS EG, but it might work in the code node.