vector similarity between "centered rating vectors"
Suppose the two rating vectors are
[r11 r12 r13 r14]
and
[r21 r22 r23 r24]
Centering means subtracting the mean of a vector from each of its entries.
Let r1 be the mean of r11..r14 and r2 be the mean of r21..r24.
Then the centered vectors are
[r11-r1 r12-r1 r13-r1 r14-r1]
[r21-r2 r22-r2 r23-r2 r24-r2]
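To make the centering step concrete, here is a minimal sketch in plain Python. The function name `center` and the sample ratings are made up for illustration; they are not from any particular library or dataset.

```python
def center(ratings):
    """Subtract the mean of the vector from each of its entries."""
    mean = sum(ratings) / len(ratings)
    return [r - mean for r in ratings]

# Hypothetical ratings given by one user to four items.
ratings1 = [5.0, 3.0, 4.0, 1.0]

centered1 = center(ratings1)
print(centered1)       # [1.75, -0.25, 0.75, -2.25]
print(sum(centered1))  # 0.0
```

The entries of a centered vector always sum to zero (up to floating-point rounding), which is a quick sanity check on the step.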
Now if you take the cosine similarity (cos theta) between these two vectors, you get
their dot product divided by the product of the norms of the two vectors.
The dot product will be of the form (r11-r1)*(r21-r2) + ... + (r14-r1)*(r24-r2).
This is the numerator of the Pearson correlation coefficient.
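The norm of the first vector is
sqrt[(r11-r1)^2 + ... + (r14-r1)^2]
which is sqrt(n) times the standard deviation of the first vector (n = 4 here), and likewise for the second vector.
So the product of the two norms is n times the product of the two standard deviations, which is exactly the denominator of the Pearson correlation coefficient when its numerator is written as the sum above.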
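If you want to check the claim numerically, the following sketch compares the cosine of the centered vectors with the Pearson correlation computed from its textbook definition (covariance divided by the product of the standard deviations). The helper names `center`, `cosine`, and `pearson` and the sample ratings are assumptions chosen for illustration.

```python
import math

def center(v):
    """Subtract the mean of the vector from each of its entries."""
    mean = sum(v) / len(v)
    return [x - mean for x in v]

def cosine(u, v):
    """Dot product divided by the product of the two norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pearson(u, v):
    """Covariance divided by the product of the standard deviations."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / n
    std_u = math.sqrt(sum((a - mean_u) ** 2 for a in u) / n)
    std_v = math.sqrt(sum((b - mean_v) ** 2 for b in v) / n)
    return cov / (std_u * std_v)

# Hypothetical ratings from two users on the same four items.
ratings1 = [5.0, 3.0, 4.0, 1.0]
ratings2 = [4.0, 2.0, 5.0, 2.0]

cos_centered = cosine(center(ratings1), center(ratings2))
cos_raw = cosine(ratings1, ratings2)
rho = pearson(ratings1, ratings2)

print(cos_centered, rho)  # these two agree up to rounding
print(cos_raw)            # cosine without centering is a different number
```

Note that the 1/n in the covariance and the two 1/sqrt(n) factors in the standard deviations cancel, which is why no n appears in the cosine form.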
QED
Rao