At the end of today's class, we saw that classification is, in some sense, a pretty easy extension of clustering: training data with different labels can be seen as making up the different clusters. When a test point arrives, we just need to figure out which cluster it is closest to and assign it the label of that cluster.
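For concreteness, here is a minimal sketch of that approach. It is one reading of "closest cluster", not the only one: it assumes each cluster is summarized by the mean (centroid) of its training points and that closeness is Euclidean distance. The function names `fit_centroids` and `classify` and the toy data are made up for illustration.

```python
import math
from collections import defaultdict


def fit_centroids(points, labels):
    """Treat each label's training points as one cluster; return {label: centroid}."""
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    # Centroid = coordinate-wise mean of the cluster's members (an assumption).
    return {
        label: tuple(sum(coord) / len(members) for coord in zip(*members))
        for label, members in groups.items()
    }


def classify(centroids, point):
    """Assign the label of the cluster whose centroid is nearest (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(centroids[label], point))


# Toy 2-D training data, invented purely for illustration.
train_points = [(1.0, 1.0), (1.0, 3.0), (9.0, 9.0), (11.0, 9.0)]
train_labels = ["a", "a", "b", "b"]

centroids = fit_centroids(train_points, train_labels)
predicted = classify(centroids, (2.0, 2.5))
print(predicted)
```

Note that the sketch already bakes in two choices the paragraph leaves open: how a cluster is represented and how distance is measured.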
1. If classification is so darned straightforward, how come the entire field of machine learning is obsessed with more and more approaches to classification? What could possibly be wrong with the straightforward one we outlined? Can you list any problems our simple approach can run into? (Alternatively, it is fine to just decide that Jieping Ye and Huan Liu cannot leave well enough alone... :-)
2. If you listed some problems in 1 (as opposed to casting aspersions on Ye and Liu), can you comment on the ramifications of those problems for clustering itself? Or is clustering still pretty fine as it is?