Tuesday, April 13, 2010

Clarification on the "rand-index" based external cluster evaluation


In talking to a student, I realized I hadn't made something clear when describing the Rand-index-based external cluster evaluation method.

The Rand index classifies every pair of entities (e1, e2) into exactly one of four categories:

[in-the-same-ground-truth-cluster, in-the-same-generated-cluster]  A
[in-the-same-ground-truth-cluster, in-different-generated-clusters]  C
[in-different-ground-truth-clusters, in-the-same-generated-cluster]  B
[in-different-ground-truth-clusters, in-different-generated-clusters]  D

If A, B, C, D are the numbers of pairs in each category, then A+B+C+D will be n*(n-1)/2 (which is the number of distinct pairs over n entities), since every pair falls into exactly one category.
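To make the counting concrete, here is a minimal Python sketch, not taken from the slides. It assumes each clustering is given as a list of cluster labels, one per entity; the function names `pair_counts` and `rand_index` and the example labels are made up for illustration. It also uses the standard definition of the Rand index as the fraction of pairs on which the two clusterings agree, (A+D)/(A+B+C+D).

```python
from itertools import combinations


def pair_counts(truth, generated):
    """Return (A, B, C, D), using the category letters from the list above."""
    if len(truth) != len(generated):
        raise ValueError("both labelings must cover the same entities")
    a = b = c = d = 0
    # Visit each unordered pair of entities exactly once.
    for i, j in combinations(range(len(truth)), 2):
        same_truth = truth[i] == truth[j]
        same_generated = generated[i] == generated[j]
        if same_truth and same_generated:
            a += 1  # same ground-truth cluster, same generated cluster
        elif same_truth:
            c += 1  # same ground-truth cluster, different generated clusters
        elif same_generated:
            b += 1  # different ground-truth clusters, same generated cluster
        else:
            d += 1  # different in both
    return a, b, c, d


def rand_index(truth, generated):
    """Fraction of pairs on which the two clusterings agree: (A+D)/(A+B+C+D)."""
    a, b, c, d = pair_counts(truth, generated)
    total = a + b + c + d
    if total == 0:
        raise ValueError("need at least two entities to form a pair")
    return (a + d) / total


# Hypothetical example with n = 6 entities, so 6*5/2 = 15 pairs.
truth = [0, 0, 0, 1, 1, 1]
generated = [0, 0, 1, 1, 2, 2]
a, b, c, d = pair_counts(truth, generated)
n = len(truth)
assert a + b + c + d == n * (n - 1) // 2
print(a, b, c, d, rand_index(truth, generated))
```

The loop is quadratic in n, which is fine for checking the idea on small examples; larger data sets would normally use a contingency-table formulation instead.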

I modified the slide to make this point clear.

