Neither keyword search nor semantic search alone is perfect. Keyword search excels at finding exact terms and specific identifiers but misses semantically similar content phrased differently. Semantic search understands meaning and intent but can miss exact matches and struggle with proper nouns, acronyms, and technical terms. Hybrid search combines both approaches to create a retrieval system that is greater than the sum of its parts.
Understanding the Two Search Paradigms
Keyword Search (Sparse Retrieval)
Traditional keyword search, powered by algorithms like BM25, matches documents based on shared terms. It works by counting how often query terms appear in documents, adjusted for document length and term rarity. BM25 has been the backbone of search engines for decades and remains remarkably effective for many queries.
Keyword search is fast and well understood, needs no embedding infrastructure, performs very well on exact matches, and handles proper nouns, product codes, and domain-specific terminology reliably. Its weaknesses are that it cannot match semantically similar content that is worded differently, it is sensitive to synonyms and paraphrasing, and it has no understanding of query intent.
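To make the BM25 description above concrete, here is a minimal sketch of the scoring idea in plain Python. The function name `bm25_scores`, the whitespace tokenizer, the sample documents, and the parameter values `k1=1.5` and `b=0.75` are illustrative assumptions; production engines add proper tokenization, stemming, and inverted indexes.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document in `docs` against `query` with a basic BM25 formula."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))

    scores = []
    for toks in tokenized:
        term_freq = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            df = doc_freq[term]
            # Rare terms receive a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            tf = term_freq[term]
            # k1 saturates term frequency; b controls length normalization.
            denom = tf + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf * (k1 + 1) / denom
        scores.append(score)
    return scores

# Hypothetical documents for illustration only.
docs = [
    "error code E4021 raised by the payment gateway",
    "how to repair a leaking kitchen tap",
    "payment gateway configuration guide",
]
scores = bm25_scores("E4021 payment", docs)
```

The first document matches both query terms, including the rare identifier, so it scores highest. The second shares no terms with the query and scores zero, which is exactly the blind spot that semantic search addresses.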
Semantic Search (Dense Retrieval)
Semantic search uses embedding models to convert text into dense vectors and finds matches based on vector similarity. It understands that "automobile" and "car" mean the same thing, and that "how to fix a leaky faucet" is related to "plumbing repair guide" even though they share no words.
Strengths include understanding of meaning and intent, robustness to paraphrasing and synonyms, and ability to find conceptually related content. Weaknesses include potential to miss exact keyword matches, difficulty with rare terms and proper nouns, and dependence on embedding model quality.
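The vector-similarity step can be sketched with cosine similarity over toy vectors. The three-dimensional embeddings below are hand-written stand-ins, not the output of any real model; an actual system would obtain vectors from an embedding model and search them with a vector index.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional embeddings; real models emit hundreds of dimensions.
toy_embeddings = {
    "plumbing repair guide": [0.9, 0.1, 0.0],
    "car maintenance tips": [0.1, 0.9, 0.1],
    "quarterly sales report": [0.0, 0.1, 0.9],
}
# Stand-in for the embedding of "how to fix a leaky faucet".
query_vector = [0.8, 0.2, 0.1]

# Rank documents by similarity to the query, most similar first.
ranked = sorted(
    toy_embeddings,
    key=lambda text: cosine_similarity(query_vector, toy_embeddings[text]),
    reverse=True,
)
```

With these assumed vectors, the plumbing guide ranks first even though it shares no words with the query, because the ranking depends only on vector direction.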
"Hybrid search is not a compromise between two imperfect systems. It is a combination that eliminates the blind spots of each individual approach, producing retrieval that is more accurate and more robust than either alone."
How Hybrid Search Works
The standard hybrid search architecture runs both search methods in parallel and combines their results:
- Run keyword search: Execute a BM25 query against a full-text index of your documents.
- Run semantic search: Embed the query and perform a nearest-neighbor search against your vector database.
- Normalize scores: BM25 scores and vector similarities are on different scales. If you fuse by score, normalize both to a common range (typically 0 to 1). Rank-based fusion such as RRF does not need this step.
- Combine results: Merge the two result sets using a fusion algorithm, weighting each method according to its importance for your use case.
- Return top-k: Select the top-k documents from the fused results.
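The steps above can be sketched as score-based fusion with min-max normalization. This is one possible approach under stated assumptions: the function names, the `alpha` convention (weight on semantic scores), and the sample scores are all hypothetical, and the two input dictionaries stand in for the output of a real BM25 index and vector database.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to the 0..1 range."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        # All scores are equal, so treat every document as equally relevant.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - low) / (high - low) for doc_id, s in scores.items()}

def weighted_fusion(keyword_scores, semantic_scores, alpha=0.5, top_k=3):
    """Fuse two score maps; alpha weights semantic scores, 1 - alpha keyword scores."""
    kw = min_max_normalize(keyword_scores)
    sem = min_max_normalize(semantic_scores)
    fused = {
        # A document missing from one result set contributes 0 from that side.
        doc_id: (1 - alpha) * kw.get(doc_id, 0.0) + alpha * sem.get(doc_id, 0.0)
        for doc_id in kw.keys() | sem.keys()
    }
    return sorted(fused, key=fused.get, reverse=True)[:top_k]

# Hypothetical raw scores from each retriever.
keyword_scores = {"doc_a": 12.4, "doc_b": 7.1, "doc_c": 3.3}
semantic_scores = {"doc_b": 0.91, "doc_d": 0.88, "doc_a": 0.62}
top_docs = weighted_fusion(keyword_scores, semantic_scores, alpha=0.5, top_k=3)
```

In this example `doc_b` ranks first because it scores well in both lists, while `doc_c`, which only the keyword search found with a low score, drops out of the top three.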
Reciprocal Rank Fusion (RRF)
One of the most widely used fusion algorithms is Reciprocal Rank Fusion, which combines ranked lists by assigning a score based on each document's position in each list:
RRF_score(d) = sum(1 / (k + rank_i(d))) for each ranking i
where:
- d is a document
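- k is a smoothing constant (typically 60), unrelated to the k in top-k
- rank_i(d) is document d's position in ranking i, counting from 1; a ranking that does not contain d contributes nothing to the sum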
RRF is effective because it does not require score normalization; it works purely on rank positions. Documents that appear high in both keyword and semantic rankings receive the highest combined scores.
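The formula translates almost directly into code. Below is a minimal sketch; the function name `reciprocal_rank_fusion`, the optional `top_n` cutoff, and the document IDs are illustrative assumptions, and ranks are counted from 1.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60, top_n=None):
    """Fuse ranked lists of document IDs using reciprocal rank scores."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); absent documents add nothing.
            scores[doc_id] += 1.0 / (k + rank)
    fused = sorted(scores, key=scores.get, reverse=True)
    return fused[:top_n] if top_n is not None else fused

# Hypothetical rankings from a keyword search and a semantic search.
keyword_ranking = ["doc_a", "doc_b", "doc_c"]
semantic_ranking = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([keyword_ranking, semantic_ranking])
```

Here `doc_b` (ranks 2 and 1) edges out `doc_a` (ranks 1 and 3), and both beat the documents that appear in only one list. No raw scores are needed, only positions.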
Key Takeaway
Hybrid search with RRF is a strong default retrieval strategy for RAG systems. It handles a wider range of query types than either keyword or semantic search alone.
Implementing Hybrid Search
Several tools make hybrid search straightforward to implement:
- Weaviate: Offers built-in hybrid search with a configurable alpha parameter that sets the balance between BM25 and vector search.
- Elasticsearch: Combines its mature BM25 engine with dense vector search in a single query using the kNN feature.
- Qdrant: Supports sparse vectors (for BM25-like retrieval) alongside dense vectors for true hybrid search.
- LangChain EnsembleRetriever: Combines two or more retrievers (e.g., BM25 + FAISS) using weighted RRF.
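# Import paths vary by LangChain version; these match releases that ship langchain_community.
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever  # requires the rank_bm25 package
from langchain_community.vectorstores import FAISS

# Assumed to exist already: `documents` (a list of Document objects)
# and `embeddings` (any LangChain embedding model).

# Create individual retrievers
bm25_retriever = BM25Retriever.from_documents(documents)
bm25_retriever.k = 5  # return the top 5 keyword matches

faiss_vectorstore = FAISS.from_documents(documents, embeddings)
faiss_retriever = faiss_vectorstore.as_retriever(search_kwargs={"k": 5})

# Combine with ensemble; weights follow the order of `retrievers`
hybrid_retriever = EnsembleRetriever(
    retrievers=[bm25_retriever, faiss_retriever],
    weights=[0.4, 0.6],  # weight semantic search slightly higher
)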
Tuning the Balance
The relative weight between keyword and semantic search significantly affects results. Some guidelines:
- Weight keyword search higher when your domain has many unique terms, acronyms, or identifiers that must match exactly.
- Weight semantic search higher when users ask natural language questions and the same concept appears in many different phrasings.
- Use equal weights as a reasonable starting point, then adjust based on evaluation results.
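One way to "adjust based on evaluation results" is to sweep a few candidate weights over a small labeled query set and compare a simple metric. The sketch below uses a weighted variant of RRF and a top-1 hit rate; the function names, the three-query evaluation set, and the candidate weights are all hypothetical, and a real evaluation would use many more queries and a metric suited to your application.

```python
def weighted_rrf(keyword_ranking, semantic_ranking, keyword_weight, k=60):
    """RRF where each list's contribution is scaled by its weight."""
    scores = {}
    weighted_lists = (
        (keyword_weight, keyword_ranking),
        (1 - keyword_weight, semantic_ranking),
    )
    for weight, ranking in weighted_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

def hit_rate_at_1(eval_set, keyword_weight):
    """Fraction of queries whose top fused result is the labeled relevant doc."""
    hits = sum(
        weighted_rrf(kw, sem, keyword_weight)[0] == relevant
        for kw, sem, relevant in eval_set
    )
    return hits / len(eval_set)

# Hypothetical labeled queries: (keyword ranking, semantic ranking, relevant doc).
eval_set = [
    (["sku_42", "faq_1"], ["faq_1", "sku_42"], "sku_42"),     # identifier lookup
    (["log_7", "guide_3"], ["guide_3", "log_7"], "guide_3"),  # natural-language question
    (["sku_9", "faq_2"], ["faq_2", "sku_9"], "sku_9"),        # identifier lookup
]

# Evaluate each candidate keyword weight and keep the best one.
results = {w: hit_rate_at_1(eval_set, w) for w in (0.3, 0.5, 0.7)}
best_weight = max(results, key=results.get)
```

Because this toy set is dominated by identifier lookups, a higher keyword weight scores better than a lower one, which mirrors the first guideline above. On a query set dominated by natural-language questions the result would flip.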
Beyond Simple Hybrid Search
Advanced hybrid search systems add further layers to improve retrieval quality. Re-ranking uses a cross-encoder model to rescore the fused results, giving a more precise relevance ordering at the cost of extra latency per query. Query routing uses a classifier to decide whether a given query would benefit more from keyword or semantic search and adjusts the weights dynamically. Learned sparse representations, such as SPLADE, produce sparse vectors that combine keyword matching with learned term importance weights.
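As a rough illustration of query routing, the sketch below replaces the classifier with a simple rule-based heuristic. This is a deliberate simplification, not a trained model: the function name `route_weights`, the regular expression, the word-count threshold, and the returned weights are all assumptions chosen for the example.

```python
import re

# Tokens that look like identifiers: any word containing a digit,
# or an all-caps acronym of two or more letters.
IDENTIFIER_PATTERN = re.compile(r"\b(?=\w*\d)\w+\b|\b[A-Z]{2,}\b")

def route_weights(query):
    """Return (keyword_weight, semantic_weight) for a query."""
    # Quoted phrases and identifier-like tokens suggest exact matching matters.
    if '"' in query or IDENTIFIER_PATTERN.search(query):
        return 0.7, 0.3
    # Longer free-text questions tend to benefit from semantic matching.
    if len(query.split()) >= 6:
        return 0.3, 0.7
    # Otherwise fall back to equal weights.
    return 0.5, 0.5

identifier_query = route_weights("error E4021 on checkout")
question_query = route_weights("how do I reset my account password safely")
short_query = route_weights("pricing plans")
```

The returned pair can be passed as the weights of whatever fusion step you use. A learned classifier trained on your own query logs would replace the rules once you have evaluation data to justify it.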
Key Takeaway
Start with a simple hybrid search implementation using RRF fusion and equal weights. Evaluate on your actual queries, then tune the balance and add re-ranking only if the evaluation shows room for improvement.
