
Challenge of the week: Piecewise linear clustering versus SVM

In this challenge, we ask you to invent a new clustering technique based on separating hyperplanes. SVMs (support vector machines) add many fictitious (dummy) variables and apply a non-linear mapping (to increase dimensionality and find separating hyperplanes in the transformed space), thus providing exact or near-exact class separation (the purpose of clustering!) when traditional linear separation fails.
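To make the contrast concrete, here is a hedged baseline sketch (not part of the challenge solution) showing how a kernel SVM separates two classes that no single hyperplane can split. The dataset (scikit-learn's `make_circles`) is an illustrative stand-in for the two classes, not data from the article.

```python
# Baseline for comparison: linear vs. kernel SVM on a dataset with no
# separating hyperplane in the original 2-D space (concentric circles).
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear = SVC(kernel="linear").fit(X, y)  # no linear separator exists here
rbf = SVC(kernel="rbf").fit(X, y)        # non-linear mapping to higher dimension

print(linear.score(X, y))  # well below 1.0
print(rbf.score(X, y))     # near-perfect separation
```

The RBF kernel implicitly performs the dimensionality-increasing mapping described above, so the separator becomes a curve (a frontier of the kind shown in the figure) when projected back to 2-D.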

The blue line is the frontier (combination of line segments) between the two classes

Here we also focus on the case where no separating hyperplane exists. For simplicity, let's say that we have only two classes. You are asked to develop a technique

  • possibly based on the convex hulls associated with each class's training set, investigating what happens when the two convex hulls (one for each class) overlap
  • possibly based on generating many hyperplanes (combinatorial optimization) and identifying a stable solution after partitioning the 2-D or 3-D space into a number of simplices (determined by these hyperplanes: segments in 2-D, faces in 3-D)
  • or using Voronoi diagrams
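As a starting point for the first bullet, here is a hedged sketch of a convex-hull overlap test using SciPy. The function name `hulls_overlap` and the sample clusters are my own illustration; note that this point-in-hull check is a sufficient but not exhaustive test, since two hulls can intersect without either containing the other's points.

```python
# Sketch: detect whether the two classes' convex hulls overlap, by checking
# whether any point of one class lies inside the other class's hull.
# Delaunay triangulation covers the convex hull, and find_simplex returns -1
# for points outside it.
import numpy as np
from scipy.spatial import Delaunay

def hulls_overlap(class_a, class_b):
    """True if any point of one class lies inside the other's convex hull."""
    a_in_b = Delaunay(class_b).find_simplex(class_a) >= 0
    b_in_a = Delaunay(class_a).find_simplex(class_b) >= 0
    return bool(a_in_b.any() or b_in_a.any())

rng = np.random.default_rng(0)
separated = hulls_overlap(rng.normal(0, 1, (50, 2)), rng.normal(10, 1, (50, 2)))
mixed = hulls_overlap(rng.normal(0, 1, (50, 2)), rng.normal(0.5, 1, (50, 2)))
print(separated, mixed)  # False True
```

When the hulls are disjoint, a separating hyperplane exists; the interesting case for this challenge is when `hulls_overlap` returns True.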

Whatever your technique, it must be based on robust cross-validation. 
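For the cross-validation requirement, a minimal sketch using scikit-learn's k-fold helper is shown below; the SVM is a placeholder for whatever candidate technique you develop, and `make_moons` is an illustrative dataset of my choosing.

```python
# Sketch of the cross-validation harness: score a candidate classifier
# on 5 held-out folds rather than on its own training data.
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
scores = cross_val_score(SVC(kernel="rbf"), X, y, cv=5)
print(scores.mean())
```

Any new piecewise linear technique should be evaluated the same way, so that a jagged frontier that merely memorizes the training points is penalized on the held-out folds.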

Alternate question: what software do you use to display these diagrams?
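One possible answer to the alternate question, offered as a sketch rather than a recommendation: SciPy computes the Voronoi diagram and its `voronoi_plot_2d` helper renders it with matplotlib. The random points and output filename are my own illustration.

```python
# Compute and draw a Voronoi diagram for 20 random 2-D points.
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt
from scipy.spatial import Voronoi, voronoi_plot_2d

points = np.random.default_rng(1).random((20, 2))
vor = Voronoi(points)
fig = voronoi_plot_2d(vor)
fig.savefig("voronoi.png")
```

Other common choices include R's `deldir` package and D3's `d3-delaunay` for interactive web versions.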


Replies to This Discussion

Can someone post the reference for the Voronoi picture in my article? I was about to abandon it because of the low quality, but then could not find anything that better illustrates this challenge, and I lost the source. Or if you find another image (with source), let me know, and I'll update this article and include the source. I'd like to have a great Voronoi diagram for the picture of the week, next week.

Thanks Alastair!

