Embedding Model
A model specifically designed to convert text, images, or other data into dense vector representations (embeddings) that capture semantic meaning.
How They Work
Embedding models are typically encoder-only transformers trained with contrastive learning objectives. They process input text and output a fixed-size vector (e.g., 768 or 1536 dimensions) that captures the input's meaning, so that semantically similar inputs map to nearby points in vector space, usually compared with cosine similarity.
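To make the comparison step concrete, here is a minimal sketch of cosine similarity over embeddings. The 4-dimensional vectors below are hypothetical, hand-picked stand-ins for model output (a real model would produce 768+ dimensions from the text itself):

```python
import math

# Toy 4-dimensional "embeddings" (real models output e.g. 768 or 1536 dims).
# These vectors are hypothetical, chosen only to illustrate the comparison;
# an actual embedding model would compute them from the input text.
embeddings = {
    "a cat sits on the mat":   [0.90, 0.10, 0.20, 0.00],
    "a feline rests on a rug": [0.85, 0.15, 0.25, 0.05],
    "stock prices fell today": [0.00, 0.90, 0.00, 0.40],
}

def cosine(a, b):
    # Cosine similarity: dot product divided by the product of L2 norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

related = cosine(embeddings["a cat sits on the mat"],
                 embeddings["a feline rests on a rug"])
unrelated = cosine(embeddings["a cat sits on the mat"],
                   embeddings["stock prices fell today"])
print(related > unrelated)  # True: paraphrases land closer in vector space
```

The same dot-product comparison is what a vector database performs at scale when it ranks stored embeddings against a query embedding.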
Popular Models
OpenAI text-embedding-3: high quality, API-based.
Cohere embed-v3: strong multilingual support.
BGE/E5: open-source, competitive quality.
Nomic Embed: open weights with long context.
Sentence-Transformers: open-source library with many models.
Choosing a Model
Consider: dimension size (affects storage and search cost), context length, multilingual support, domain specificity, latency requirements, and whether you need API access or local deployment. The MTEB (Massive Text Embedding Benchmark) leaderboard compares embedding models across tasks such as retrieval, classification, and clustering.
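The storage impact of dimension size is easy to estimate: a float32 index costs roughly vectors × dimensions × 4 bytes. A back-of-envelope sketch, with a hypothetical corpus size:

```python
# Rough storage cost of a float32 embedding index:
# bytes = num_vectors * dimensions * 4.
# The corpus size and dimension choices are illustrative assumptions.
def index_size_mb(num_vectors: int, dims: int, bytes_per_float: int = 4) -> float:
    return num_vectors * dims * bytes_per_float / 1024**2

corpus = 1_000_000  # one million text chunks (hypothetical)
for dims in (384, 768, 1536):
    print(f"{dims} dims -> {index_size_mb(corpus, dims):.0f} MB")
```

Doubling the dimension count doubles storage (and, for brute-force search, query cost), which is why smaller models or dimension-reduction options can be attractive at scale.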