MapReduce
A programming model for processing large datasets in parallel across a cluster. In AI contexts, the term also describes a pattern for processing input that exceeds a model's context window.
In LLM Applications
When a document is too long to summarize within a single context window, the MapReduce pattern splits it into chunks, processes each chunk independently (map), then combines the results (reduce). This is a common pattern in LangChain and similar frameworks.
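The split, map, and reduce steps can be sketched in a few lines of Python. This is a minimal illustration, not any framework's API: `split_into_chunks`, `map_reduce`, and `first_word` are hypothetical names, chunk size is counted in words as a rough stand-in for tokens, and `first_word` takes the place of a real model call.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_words: int) -> List[str]:
    """Split text into chunks of at most max_words words.

    Word count is an assumed stand-in for a real token budget.
    """
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def map_reduce(
    text: str,
    max_words: int,
    map_fn: Callable[[str], str],
    reduce_fn: Callable[[List[str]], str],
) -> str:
    """Split the text, apply map_fn to every chunk, then combine with reduce_fn."""
    chunks = split_into_chunks(text, max_words)
    partials = [map_fn(chunk) for chunk in chunks]  # map: one result per chunk
    return reduce_fn(partials)                      # reduce: one final result


def first_word(chunk: str) -> str:
    """Hypothetical placeholder for a model call: keep only the first word."""
    return chunk.split()[0]


result = map_reduce("a b c d e f g", 3, first_word, " ".join)
print(result)
```

In a real application, `map_fn` and `reduce_fn` would each wrap a prompt sent to the model, and the splitter would usually respect sentence or paragraph boundaries.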
Map Phase
Each chunk is processed independently: summarized, analyzed, or transformed. Because no chunk depends on another chunk's result, the calls can run in parallel, which shortens the elapsed time for large documents.
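Because the chunks are independent, the map step can be dispatched concurrently. The sketch below uses Python's standard `concurrent.futures.ThreadPoolExecutor`, which suits I/O-bound work such as remote model calls. `summarize_chunk` and `map_chunks` are hypothetical names, and `summarize_chunk` is a placeholder that keeps the first sentence instead of calling a model.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def summarize_chunk(chunk: str) -> str:
    """Hypothetical placeholder for a model call: keep the first sentence."""
    return chunk.split(".")[0].strip() + "."


def map_chunks(
    chunks: List[str],
    map_fn: Callable[[str], str],
    max_workers: int = 4,
) -> List[str]:
    """Apply map_fn to every chunk concurrently.

    Executor.map returns results in input order, so the partial
    results stay aligned with the original chunk order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(map_fn, chunks))


chunks = [
    "Cats sleep a lot. They also purr.",
    "Dogs bark. They fetch.",
]
partials = map_chunks(chunks, summarize_chunk)
print(partials)
```

Preserving chunk order matters when the reduce step depends on document order, as it does when summarizing a narrative.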
Reduce Phase
The individual chunk results are combined into a final output. This might involve summarizing the summaries, merging extracted entities, or aggregating answers.
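Two reduce strategies are sketched below under stated assumptions. `merge_entities` shows one way to merge extracted entities: a case-insensitive, order-preserving union. `reduce_in_batches` is an assumed extension for the case where the partial results are too many to combine in a single call: it combines them a few at a time and repeats until one result remains. Both names are hypothetical, and the `combine` function is a stand-in for a model call that would summarize the summaries.

```python
from typing import Callable, List


def merge_entities(per_chunk: List[List[str]]) -> List[str]:
    """Union entity lists from all chunks, dropping case-insensitive duplicates.

    The first spelling seen is kept, and first-seen order is preserved.
    """
    seen = set()
    merged = []
    for entities in per_chunk:
        for entity in entities:
            key = entity.lower()
            if key not in seen:
                seen.add(key)
                merged.append(entity)
    return merged


def reduce_in_batches(
    partials: List[str],
    combine_fn: Callable[[List[str]], str],
    batch_size: int,
) -> str:
    """Combine partial results batch by batch until a single result remains."""
    if batch_size < 2:
        raise ValueError("batch_size must be at least 2")
    if not partials:
        raise ValueError("nothing to reduce")
    while len(partials) > 1:
        partials = [
            combine_fn(partials[i:i + batch_size])
            for i in range(0, len(partials), batch_size)
        ]
    return partials[0]


entities = merge_entities([["Ada", "Turing"], ["turing", "Hopper"]])
combined = reduce_in_batches(["a", "b", "c", "d", "e"], "+".join, 2)
print(entities)
print(combined)
```

The batch size would normally be chosen so that each batch of partial results fits within the model's context window.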