RoPE (Rotary Position Embedding)
A positional encoding method that encodes position information through rotation of the query and key vectors, enabling relative position awareness and length generalization.
How It Works
RoPE applies rotation matrices to query and key vectors before computing attention. Each pair of embedding dimensions is treated as a 2D subspace and rotated by an angle proportional to the token's position, with a different frequency per pair. This naturally encodes relative positions: the dot product between a rotated query and key depends only on the distance between the two tokens, not their absolute positions.
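A minimal NumPy sketch of the rotation described above, followed by a check of the relative-position property (function names and shapes here are illustrative, not from any particular library):

```python
import numpy as np

def rope_rotate(x, positions, base=10000.0):
    """Apply RoPE to vectors x of shape (seq_len, d); d must be even.

    Each consecutive pair of dimensions is rotated in 2D by
    position * inv_freq, with a different frequency per pair.
    """
    d = x.shape[-1]
    # One frequency per dimension pair, decaying geometrically
    inv_freq = 1.0 / (base ** (np.arange(0, d, 2) / d))   # (d/2,)
    angles = np.outer(positions, inv_freq)                # (seq_len, d/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[..., 0::2], x[..., 1::2]                   # split into pairs
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin                  # standard 2D rotation
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

# Relative-position property: the query-key dot product depends only
# on the distance between positions, not on the positions themselves.
rng = np.random.default_rng(0)
q = rng.normal(size=8)
k = rng.normal(size=8)
s1 = float(rope_rotate(q[None], [3])[0] @ rope_rotate(k[None], [1])[0])
s2 = float(rope_rotate(q[None], [10])[0] @ rope_rotate(k[None], [8])[0])
# s1 and s2 agree because both pairs are 2 positions apart
```

Because the rotation is applied independently to queries and keys, it composes cleanly with KV caching: cached keys are already rotated for their fixed positions.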
Why It's Popular
RoPE supports length generalization: models can handle sequences longer than those seen during training, particularly when combined with the scaling extensions below. It is also computationally efficient and compatible with KV caching. It is used in LLaMA, Mistral, Qwen, and most modern LLMs.
Extensions
YaRN and NTK-aware scaling extend RoPE to very long contexts (e.g., going from 4K to 128K tokens). These techniques rescale the rotation frequencies so that positions in the extended range map back into the range the model was trained on, interpolating low frequencies while largely preserving high ones.
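The frequency rescaling can be sketched as below for the NTK-aware variant, which raises the RoPE base rather than uniformly interpolating positions. The function name and the exact exponent form are a common formulation, presented here as an illustrative assumption rather than any library's exact API:

```python
import numpy as np

def ntk_scaled_inv_freq(d, base=10000.0, scale=32.0):
    """NTK-aware frequency scaling for RoPE context extension.

    scale = target_context / trained_context (e.g. 128K / 4K = 32).
    Raising the base leaves the highest frequency unchanged while
    shrinking the lowest frequency by exactly 1/scale, so local
    (high-frequency) position information is preserved and only the
    long-range (low-frequency) components are stretched.
    """
    new_base = base * scale ** (d / (d - 2))
    return 1.0 / (new_base ** (np.arange(0, d, 2) / d))

d = 64
orig = 1.0 / (10000.0 ** (np.arange(0, d, 2) / d))
scaled = ntk_scaled_inv_freq(d, scale=32.0)
# scaled[0] == orig[0] == 1.0 (highest frequency untouched)
# scaled[-1] == orig[-1] / 32 (lowest frequency stretched by the scale factor)
```

YaRN refines this idea by interpolating frequencies non-uniformly across the spectrum and adjusting attention temperature, but the core mechanism is the same frequency rescaling shown here.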