Context Length Extrapolation
Techniques that allow models to handle longer sequences at inference time than they saw during training.
Overview
Context length extrapolation enables language models to process sequences longer than those seen during training. Since training on very long sequences is computationally expensive, models are often trained on shorter contexts and then extended through various techniques at inference time.
Techniques
- RoPE scaling: Adjusting rotary position embedding frequencies (NTK-aware scaling, YaRN).
- ALiBi: Linear attention biases that naturally extrapolate to longer sequences.
- Positional interpolation: Compressing position indices to fit within the trained range.
- Ring attention: Distributing long contexts across multiple devices.

These techniques have enabled models trained on 4K or 8K contexts to handle 100K+ tokens at inference.
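To make the RoPE-based techniques concrete, here is a minimal NumPy sketch contrasting positional interpolation (dividing position indices by the extension factor) with NTK-aware scaling (raising the rotary base so low frequencies stretch while high frequencies stay nearly intact). The function names, the `scale` parameter, and the specific base adjustment `base * scale**(dim/(dim-2))` are illustrative assumptions, not a particular library's API:

```python
import numpy as np

def rope_inv_freq(dim: int, base: float = 10000.0) -> np.ndarray:
    # Standard RoPE inverse frequencies: theta_i = base^(-2i/dim)
    return 1.0 / (base ** (np.arange(0, dim, 2) / dim))

def rope_angles(positions: np.ndarray, dim: int, base: float = 10000.0,
                scale: float = 1.0, ntk_aware: bool = False) -> np.ndarray:
    """Rotation angles for each (position, frequency) pair.

    scale = L_inference / L_train, the context extension factor
    (hypothetical parameter for this sketch).
    - Positional interpolation: divide positions by `scale` so they
      stay within the trained range.
    - NTK-aware scaling: raise the base instead, stretching low
      frequencies while leaving high frequencies nearly unchanged.
    """
    if ntk_aware:
        # One common NTK-aware adjustment: base' = base * scale^(dim/(dim-2))
        base = base * scale ** (dim / (dim - 2))
        return np.outer(positions, rope_inv_freq(dim, base))
    return np.outer(positions / scale, rope_inv_freq(dim, base))

# Extend a model trained on 4K context to 16K (scale = 4).
pos = np.arange(16384)
pi_angles = rope_angles(pos, dim=64, scale=4.0)                 # interpolation
ntk_angles = rope_angles(pos, dim=64, scale=4.0, ntk_aware=True)

# Interpolation maps position 16383 onto the angles of original
# position 16383 / 4 = 4095.75, inside the trained 0..4095 range.
assert np.allclose(pi_angles[16383],
                   rope_angles(np.array([4095.75]), dim=64)[0])
```

The trade-off the sketch illustrates: interpolation compresses all frequencies uniformly, which can blur fine-grained positional distinctions, while NTK-aware scaling preserves the high-frequency components that encode nearby-token relationships.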