AI Glossary

Long-Context Model

A large language model (LLM) designed to process very long input sequences, with context windows ranging from 128K to over 1 million tokens.

Overview

Long-context models are language models that can process input sequences far longer than the few-thousand-token limits of earlier generations. Models like Claude (200K tokens), Gemini 1.5 (1M+ tokens), and GPT-4o (128K tokens) can take entire books, codebases, or document collections in a single prompt.

Technical Approaches

Enabling long contexts requires architectural innovations: rotary position embeddings (RoPE) with scaling to extend the trained position range, flash attention for memory efficiency, ring attention for distributing the sequence across devices, and efficient KV-cache management. Evaluation is a separate challenge: the "needle in a haystack" test measures how reliably a model recalls a specific fact placed at different depths within a long context.
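One of the techniques above, RoPE with scaling, can be sketched in a few lines. This is a minimal illustration (not any particular model's implementation) of position interpolation: positions are divided by a scale factor so that, at an extended context length, the rotation angles stay within the range seen during training. The function names here are illustrative, not from a real library.

```python
import numpy as np

def rope_angles(positions, dim, base=10000.0, scale=1.0):
    # Standard RoPE frequencies: inv_freq[i] = base^(-2i/dim)
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)
    # Position interpolation: dividing positions by the context
    # extension factor keeps angles inside the trained range.
    pos = np.asarray(positions, dtype=np.float64) / scale
    return np.outer(pos, inv_freq)  # shape (len(positions), dim // 2)

def apply_rope(x, positions, base=10000.0, scale=1.0):
    # x: (seq_len, dim) with dim even; rotates each pair
    # (x[..., 2i], x[..., 2i+1]) by its position-dependent angle.
    angles = rope_angles(positions, x.shape[-1], base, scale)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out
```

With `scale=4.0`, position 4000 is rotated by the same angles as position 1000 was during training, which is why a model pretrained at 4K positions can be adapted to 16K with modest fine-tuning. Rotation also preserves vector norms, so attention score magnitudes are unaffected by the scaling itself.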


Last updated: March 5, 2026