<p>got stuck in Cluster analysis. - AnalyticBridge (2019-03-24T00:28:21Z)</p>
<p>https://www.analyticbridge.datasciencecentral.com/forum/topics/got-stuck-in-cluster-analysis?feed=yes&xn_auth=no</p>
<p>Edmund Freeman (2014-03-15T19:41:52.231Z), https://www.analyticbridge.datasciencecentral.com/profile/EmundFreeman</p>
<p>Think about the data that you are trying to cluster. How many dimensions are you using? Are the variables highly correlated? Do the variables have different standard deviations? What is the distribution?</p>
<p></p>
<p>For instance, if your data is log-normal, then most of the cases will be at the low end of the distribution with a few at the high end. If you have a bunch of highly correlated log-normal variables, that could produce the kind of results you are seeing.</p>
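<p>To make the log-normal point concrete, here is a minimal sketch using only the Python standard library. The data is simulated (not Suresh's), and the helper names <code>skewness</code> and <code>standardize</code> are mine, made up for illustration. It shows that a log-normal variable is strongly right-skewed, and that taking logs and then z-scoring gives a roughly symmetric variable with unit standard deviation, which is usually a friendlier input for distance-based clustering.</p>

```python
import math
import random
import statistics


def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def standardize(values):
    """Z-score a variable so it has mean 0 and standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


# Assumption: simulated log-normal data standing in for one real variable.
rng = random.Random(0)
raw = [rng.lognormvariate(0.0, 1.0) for _ in range(5000)]

# Log-transform first (valid only for strictly positive values), then scale.
logged = [math.log(v) for v in raw]
scaled = standardize(logged)

raw_skew = skewness(raw)
log_skew = skewness(logged)

print(f"median={statistics.median(raw):.2f}  max={max(raw):.2f}")
print(f"skewness before log: {raw_skew:.2f}")
print(f"skewness after log:  {log_skew:.2f}")
```

<p>If the skewness drops to near zero after the log, the original variable was probably close to log-normal, and clustering on the transformed, standardized values is worth a try.</p>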
<p></p>
<p>Clustering is often treated as a garbage-disposal method; toss anything in and it gets crunched. I find that one has to put a lot of thought into the variables used to get meaningful results.</p>
<p></p>
<p>Anthony Tatum (2014-03-14T22:04:57.610Z), https://www.analyticbridge.datasciencecentral.com/profile/AnthonyTatum</p>
<p>Hi Suresh, have you derived any general statistics on your data? It sounds like the kurtosis of the distribution is really high. That could be the correct result... How many variables are you using? Are they independent of each other? I think all the variables in a cluster analysis are supposed to be fairly independent. You can run a correlation test to find out. Good luck.</p>
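<p>A rough sketch of those two checks, again on simulated data and using only the standard library. The function names <code>excess_kurtosis</code> and <code>pearson</code> are hypothetical helpers, and this computes a plain Pearson coefficient rather than a formal significance test. Note that Pearson correlation is itself sensitive to heavy tails, so it may be worth repeating the check on log-transformed values.</p>

```python
import math
import random
import statistics


def excess_kurtosis(values):
    """Mean of the fourth-power z-scores minus 3 (about 0 for normal data)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 4 for v in values) / len(values) - 3.0


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length variables."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Assumption: simulated variables, not the poster's data.
rng = random.Random(1)
base = [rng.gauss(0.0, 1.0) for _ in range(2000)]
var_a = [math.exp(b) for b in base]                               # log-normal
var_b = [math.exp(b + 0.1 * rng.gauss(0.0, 1.0)) for b in base]   # near-copy of var_a
var_c = [rng.gauss(0.0, 1.0) for _ in range(2000)]                # unrelated, normal

kurt_a = excess_kurtosis(var_a)
kurt_c = excess_kurtosis(var_c)
corr_ab = pearson(var_a, var_b)
corr_ac = pearson(var_a, var_c)

print(f"excess kurtosis: var_a={kurt_a:.1f}  var_c={kurt_c:.2f}")
print(f"correlation: a~b={corr_ab:.2f}  a~c={corr_ac:.2f}")
```

<p>A pair like <code>var_a</code> and <code>var_b</code> carries almost the same information twice, so you might drop one of them or combine them before clustering.</p>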