Semantic Similarity
A measure of how closely related the meaning of two pieces of text (or other data) are, typically computed using embeddings and distance metrics like cosine similarity.
Applications
Duplicate detection, paraphrase identification, search relevance scoring, document clustering, plagiarism detection, and as the core mechanism in semantic search and RAG systems.
Models
Sentence-BERT, OpenAI embeddings, Cohere embed, and other embedding models produce vectors where cosine similarity between vectors correlates with human judgments of semantic relatedness.