Suppose that, when modeling the training data, we first use clustering to group similar objects together and then fit a decision tree to the data that belongs to cluster 1. For the test data, how do we then determine which cluster each observation belongs to? Do we have to carry out the clustering again on the test data together with the training data? And how do we apply the decision tree that we built for cluster 1?

Comment by Ralph Winters on November 25, 2010 at 12:21pm
Yes, you can do this in R. For example, if you are using kmeans to cluster, you can use the kmeans.predict function to assign test cases to a group. It is fairly straightforward.

-Ralph Winters
Comment by Minethedata on November 25, 2010 at 9:59am
Thanks, Ralph, for your response. But doing this in Matlab does not seem to be possible, as Matlab does not give the centroids. I don't know if R gives this. Please let me know if R does.
Comment by Ralph Winters on November 25, 2010 at 8:24am
Score the new observations in the test data set according to the centroid definitions obtained from the training data set (that is, assign each test observation to its nearest training centroid), and then follow the same decision tree logic determined from the training data set.

-Ralph Winters
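
A minimal sketch of the scoring step described above, written in plain Python rather than R or Matlab. It assumes Euclidean distance, as in standard k-means. The centroid values, the two rule functions standing in for per-cluster decision trees, and the names nearest_centroid, cluster_models and predict are all hypothetical and chosen only for illustration; in practice the centroids and trees would come from the fit on the training data.

```python
import math

def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to `point` (Euclidean distance)."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))

# Hypothetical centroids, as if obtained from k-means on the TRAINING data.
train_centroids = [(1.0, 1.0), (8.0, 9.0)]

# Hypothetical stand-ins for decision trees fitted per cluster on the training data.
def tree_for_cluster_0(point):
    return "A" if point[0] > 1.5 else "B"

def tree_for_cluster_1(point):
    return "C" if point[1] > 8.5 else "D"

cluster_models = {0: tree_for_cluster_0, 1: tree_for_cluster_1}

def predict(point):
    """Assign a test point to its nearest training centroid, then apply that cluster's tree."""
    cluster = nearest_centroid(point, train_centroids)
    return cluster, cluster_models[cluster](point)

# Test observations are scored without re-running the clustering.
test_points = [(0.5, 1.2), (7.5, 9.5)]
results = [predict(p) for p in test_points]
print(results)
```

The point of the sketch is that the clustering is never refit on the test data: each test observation is only compared against the centroids saved from training, and the cluster index then selects which previously fitted tree to apply.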