AI Glossary

Test-Time Compute

Using additional computation during inference (not just training) to improve model outputs, through techniques like chain-of-thought, self-reflection, or search.

The Idea

Rather than generating a quick answer, spend more compute at inference time to 'think harder.' This can mean generating multiple candidates and selecting the best, using chain-of-thought reasoning, or iteratively refining outputs.
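The simplest form of "generate multiple candidates and select the best" can be sketched as best-of-N sampling. Here `generate` and `score` are toy stand-ins (assumptions for illustration, not any real model or reward API):

```python
import random

# Toy stand-ins (assumptions, not a real API): `generate` samples one
# candidate answer; `score` assigns it a scalar reward.
def generate(prompt: str) -> str:
    return random.choice(["41", "42", "43"])

def score(prompt: str, answer: str) -> float:
    # Toy reward model: prefer the known-correct answer.
    return 1.0 if answer == "42" else 0.0

def best_of_n(prompt: str, n: int = 8) -> str:
    """Sample n candidates and return the highest-scoring one."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: score(prompt, c))
```

Spending more compute here means raising `n`: each extra sample is another chance for the scorer to find a better answer, at the cost of another inference pass.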

Techniques

Best-of-N sampling: Generate N responses, score them, and return the best.
Self-consistency: Sample multiple reasoning chains and take the majority answer.
Tree search: Explore multiple reasoning paths, expanding the most promising ones.
Self-reflection: Have the model critique and revise its own output.
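Self-consistency can be sketched in a few lines: sample several stochastic reasoning chains and vote on the final answers. `sample_chain` below is a hypothetical stand-in for a model call (an assumption, not any real API):

```python
import random
from collections import Counter

# Hypothetical stand-in for one stochastic chain-of-thought sample:
# returns (reasoning_trace, final_answer). Most chains reach the right
# answer; some make slips, which voting averages out.
def sample_chain(prompt: str) -> tuple[str, str]:
    answer = random.choices(["42", "41"], weights=[0.8, 0.2])[0]
    return (f"step-by-step reasoning -> {answer}", answer)

def self_consistency(prompt: str, n: int = 16) -> str:
    """Sample n reasoning chains and return the majority final answer."""
    answers = [sample_chain(prompt)[1] for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]
```

Note that only the final answers are voted on, not the reasoning text itself; the intuition is that many different valid chains converge on the same correct answer, while errors scatter across different wrong ones.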

Impact

Models like OpenAI's o1 and DeepSeek-R1 use extended test-time compute to substantially improve performance on reasoning tasks. This marks a shift in scaling strategy: from training a bigger model to spending more thinking per problem at inference time.


Last updated: March 5, 2026