AI Glossary

Test-Time Training

Adapting a model's parameters on each test input to improve predictions at inference time.

Overview

Test-time training (TTT) is a technique in which a model's parameters are temporarily updated for each test input using a self-supervised objective, improving predictions on that specific input. Unlike standard inference (fixed weights) or fine-tuning (permanent updates), TTT adapts ephemerally: the updated weights are used for the current input and then discarded.
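The per-input loop above can be sketched as follows. This is a toy illustration, not any paper's method: the model is a single linear map, the self-supervised objective is a made-up reconstruction loss, and the gradient is computed by hand. The key point it demonstrates is the ephemeral update: adaptation happens on a copy of the weights, and the original model is untouched afterward.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "model": a linear map W, used both for the self-supervised
# objective and for the final prediction. Shapes, the loss, and the
# hyperparameters are illustrative assumptions.
d = 4
W = rng.normal(size=(d, d)) * 0.1           # pretrained weights (fixed)

def ssl_loss_grad(W, x):
    """Gradient of the self-supervised loss ||W x - x||^2 w.r.t. W."""
    residual = W @ x - x
    return 2.0 * np.outer(residual, x)

def ttt_predict(W, x, steps=5, lr=0.05):
    """Ephemeral adaptation: take a few gradient steps on a *copy*
    of the weights for this one input, then predict with the copy."""
    W_adapted = W.copy()                    # original weights untouched
    for _ in range(steps):
        W_adapted -= lr * ssl_loss_grad(W_adapted, x)
    return W_adapted @ x

x = rng.normal(size=d)
base_err = np.linalg.norm(W @ x - x)        # error without adaptation
ttt_err = np.linalg.norm(ttt_predict(W, x) - x)  # error after TTT steps
print(base_err, ttt_err)                    # adaptation shrinks the error
```

After `ttt_predict` returns, `W` is exactly as it was before, so the next test input starts from the same pretrained weights; a permanent fine-tune would instead keep `W_adapted`.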

Key Details

Recent TTT architectures replace the attention layer with a hidden state that is itself a small model (a linear layer or a small MLP), trained by gradient descent on the test sequence as it is processed. Because each token triggers a fixed amount of update work, cost scales linearly with context length rather than quadratically as in self-attention. TTT-Linear and TTT-MLP have shown competitive performance with Transformers while scaling better to very long sequences. This approach bridges the gap between in-context learning and fine-tuning.
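The mechanism can be sketched as a sequence layer whose hidden state is a weight matrix updated once per token. This is a simplified sketch in the spirit of TTT-Linear, not the published architecture: the real layers use learned projections to form the self-supervised targets, whereas here the objective is a stand-in reconstruction loss on the raw token.

```python
import numpy as np

rng = np.random.default_rng(1)

def ttt_linear_layer(tokens, lr=0.05):
    """Sketch of a TTT-style sequence layer.

    The hidden state is a weight matrix W ("a mini model"), updated by
    one gradient step per token on a self-supervised loss. One update
    per token means cost is O(length), unlike attention's O(length^2).
    The loss ||W x - x||^2 is an illustrative assumption standing in
    for the learned corruption/target views used in the TTT papers.
    """
    d = tokens.shape[1]
    W = np.zeros((d, d))                     # hidden state: a tiny model
    outputs = []
    for x in tokens:                         # single pass, linear in length
        grad = 2.0 * np.outer(W @ x - x, x)  # grad of ||W x - x||^2
        W = W - lr * grad                    # "train" on this token
        outputs.append(W @ x)                # output from the updated state
    return np.stack(outputs)

seq = rng.normal(size=(16, 8))               # 16 tokens, 8 dimensions
out = ttt_linear_layer(seq)
print(out.shape)                             # one output per token
```

The design choice to make the hidden state a trainable model is what lets the layer compress arbitrarily long context into fixed-size state, in contrast to attention's growing key/value cache.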

Related Concepts

Test-time compute · Fine-tuning · In-context learning
Last updated: March 5, 2026