Cosine Similarity
A metric that measures the angle between two vectors, quantifying how similar their directions are regardless of magnitude. Values range from -1 (opposite) to 1 (identical).
The Math
Cosine similarity = (A dot B) / (||A|| * ||B||). It measures the cosine of the angle between two vectors. A value of 1 means the vectors point in the same direction, 0 means they are orthogonal, and -1 means they point in opposite directions.
Why It's Preferred for Embeddings
Unlike Euclidean distance, cosine similarity is invariant to vector magnitude. Two documents about the same topic but of different lengths will have similar cosine similarity. This makes it ideal for comparing text embeddings, where magnitude often reflects document length rather than meaning.
Applications
Semantic search, recommendation systems, duplicate detection, document clustering, and as the core similarity metric in vector databases used for RAG systems.