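Overview:
Cosine similarity describes how similar two vectors are, expressed as the cosine of the angle between them. When the two vectors point in the same direction (an angle of 0), the cosine is 1, indicating strong correlation; when they are perpendicular (in linear algebra, two perpendicular dimensions are independent of each other), the cosine is 0, indicating that they are unrelated.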
Cosine similarity is a measure of similarity between two vectors of an inner product space that measures the cosine of the angle between them. The cosine of 0° is 1, and it is less than 1 for any other angle. It is thus a judgement of orientation and not magnitude: two vectors with the same orientation have a cosine similarity of 1, two vectors at 90° have a similarity of 0, and two vectors diametrically opposed have a similarity of -1, independent of their magnitude. Cosine similarity is particularly used in positive space, where the outcome is neatly bounded in [0,1].
Definition
Basics.
The cosine of two vectors can be derived by using the Euclidean dot product formula:

$$A \cdot B = \|A\| \, \|B\| \cos\theta$$
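Given two vectors of attributes, A and B, the cosine similarity, cos(θ), is represented using a dot product and magnitude as

$$\text{similarity} = \cos(\theta) = \frac{A \cdot B}{\|A\| \, \|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2} \, \sqrt{\sum_{i=1}^{n} B_i^2}}$$

where A_i and B_i are the components of A and B.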
The resulting similarity ranges from −1 meaning exactly opposite, to 1 meaning exactly the same, with 0 usually indicating independence, and in-between values indicating intermediate similarity or dissimilarity.
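The definition can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function name `cosine_similarity` and the sample vectors are assumptions, not part of the original text, and vectors are assumed to be equal-length sequences of numbers.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))          # A . B
    norm_a = math.sqrt(sum(x * x for x in a))       # ||A||
    norm_b = math.sqrt(sum(y * y for y in b))       # ||B||
    if norm_a == 0 or norm_b == 0:
        # The angle is undefined when either vector has zero length.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical sample vectors chosen to hit the three landmark values.
same_direction = cosine_similarity([1, 2, 3], [2, 4, 6])   # parallel, different magnitude
perpendicular = cosine_similarity([1, 0], [0, 5])          # at 90 degrees
opposite = cosine_similarity([1, 2], [-3, -6])             # diametrically opposed
```

Up to floating-point rounding, the three results are 1, 0 and -1 respectively. Scaling either vector does not change them, which is the sense in which the measure depends on orientation and not magnitude.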
Relationship to the Pearson correlation coefficient
If the attribute vectors are normalized by subtracting the vector means (e.g., $A - \bar{A}$), the measure is called centered cosine similarity and is equivalent to the Pearson correlation coefficient.
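The equivalence can be checked numerically with a small sketch. Everything here is illustrative: the helper names (`cosine_similarity`, `centered_cosine_similarity`, `pearson`) and the sample data are assumptions, and the Pearson coefficient is computed directly from its textbook definition (covariance divided by the product of standard deviations) rather than from any particular library.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between equal-length vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def centered_cosine_similarity(a, b):
    """Cosine similarity after subtracting each vector's own mean."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_similarity([x - mean_a for x in a],
                             [y - mean_b for y in b])

def pearson(a, b):
    """Pearson correlation: covariance / (std_a * std_b)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    std_a = math.sqrt(sum((x - mean_a) ** 2 for x in a) / n)
    std_b = math.sqrt(sum((y - mean_b) ** 2 for y in b) / n)
    return cov / (std_a * std_b)

# Hypothetical attribute vectors.
a = [1, 2, 3, 4, 5]
b = [2, 1, 4, 3, 5]

centered = centered_cosine_similarity(a, b)
r = pearson(a, b)
plain = cosine_similarity(a, b)   # uncentered, generally a different value
```

For these vectors `centered` and `r` agree up to rounding, while `plain` differs, because the uncentered measure is also influenced by each vector's mean.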
