AI Glossary

Temperature (in LLMs)

A parameter that controls the randomness of language model outputs. Lower temperature produces more focused, deterministic responses; higher temperature produces more creative, diverse ones.

How It Works

Temperature divides the logits (raw prediction scores) before they pass through the softmax function, reshaping the probability distribution over tokens:

- Temperature 0: treated as greedy decoding; the most likely token is always picked (deterministic in most implementations).
- Temperature 0.7: a common default that trades some determinism for variety.
- Temperature 1.0: samples from the model's learned distribution unchanged.
- Temperature >1: flattens the distribution, making outputs increasingly random.
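The scaling step above can be sketched in a few lines of Python. This is a minimal illustration, not any particular library's implementation; the function name and toy logits are invented for the example:

```python
import math

def softmax_with_temperature(logits, temperature):
    # Divide each logit by the temperature, then apply softmax.
    # temperature < 1 sharpens the distribution (more deterministic);
    # temperature = 1 leaves the learned distribution unchanged;
    # temperature > 1 flattens it (more random).
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # toy raw scores for a 3-token vocabulary
print(softmax_with_temperature(logits, 0.5))  # sharper: mass concentrates on token 0
print(softmax_with_temperature(logits, 1.0))  # the unscaled softmax
print(softmax_with_temperature(logits, 2.0))  # flatter: probabilities move closer together
```

Note that temperature 0 cannot be plugged into this formula directly (division by zero); samplers special-case it as an argmax over the logits.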

Choosing the Right Temperature

- Low (0–0.3): factual Q&A, code generation, math.
- Medium (0.5–0.8): general conversation, balanced creativity.
- High (0.8–1.2): creative writing, brainstorming, poetry.
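The practical effect of these ranges can be seen by repeatedly sampling from a fixed toy distribution at different temperatures. This is an illustrative sketch (the logits and counts are invented, and a seeded generator is used so the run is repeatable):

```python
import math
import random

def sample(logits, temperature, rng):
    # Temperature-scale, softmax, then draw one token index.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return rng.choices(range(len(logits)), weights=probs)[0]

rng = random.Random(0)
logits = [3.0, 1.5, 1.0, 0.2]  # token 0 is the model's clear favorite

results = {}
for t in (0.2, 0.7, 1.2):
    draws = [sample(logits, t, rng) for _ in range(1000)]
    results[t] = draws.count(0) / len(draws)
    print(f"temperature {t}: top token chosen {results[t]:.0%} of the time")
```

At low temperature the favorite token dominates almost every draw (good for factual or code tasks); at high temperature the other tokens appear far more often (useful for brainstorming).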

Interaction with Top-P

Temperature is often combined with top-p (nucleus) sampling, which limits sampling to the smallest set of tokens whose cumulative probability exceeds p. Together, they give fine-grained control over generation diversity.
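The combination described above can be sketched as follows: scale by temperature, softmax, then restrict sampling to the nucleus. This is a simplified illustration of the idea, not a production sampler; the function name and defaults are invented for the example:

```python
import math
import random

def top_p_sample(logits, temperature=1.0, top_p=0.9, rng=random):
    # 1. Temperature-scale the logits and apply softmax.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # 2. Keep the smallest set of highest-probability tokens whose
    #    cumulative probability reaches top_p (the "nucleus").
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # 3. Sample only among the kept tokens, renormalized by rng.choices.
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights)[0]

print(top_p_sample([5.0, 1.0, 0.5], temperature=0.7, top_p=0.9))
```

Lowering top_p shrinks the nucleus; with a very small top_p only the single most likely token survives, regardless of temperature.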


Last updated: March 5, 2026