Parameter (Model Weight)
The learnable values within a neural network that are adjusted during training to minimize the loss function. Model size is typically measured in parameter count.
What Parameters Are
Parameters are the numbers that define what a model has learned. In a neural network, they are the weights and biases of each layer. A 70B parameter model has 70 billion individual numbers that were optimized during training.
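Parameter counts follow directly from layer shapes. A minimal sketch (the 784→256→128→10 network is a hypothetical example, not from the original): each fully connected layer contributes in_features × out_features weights plus out_features biases.

```python
# Count learnable parameters in a small fully connected network.
# Every weight and bias counted here is one "parameter" -- one number
# adjusted during training to minimize the loss.

def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases for one fully connected layer."""
    return in_features * out_features + out_features

# A hypothetical 3-layer MLP: 784 -> 256 -> 128 -> 10
layers = [(784, 256), (256, 128), (128, 10)]
total = sum(linear_params(i, o) for i, o in layers)
print(total)  # prints 235146
```

The same bookkeeping scales up: a 70B model is this calculation repeated over far larger matrices.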
Parameters vs. Hyperparameters
Parameters are learned automatically during training (weights, biases). Hyperparameters are set manually before training begins (learning rate, batch size, architecture choices). The distinction matters because hyperparameters control how the parameters are learned, not what is learned.
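A toy gradient-descent loop makes the split concrete (the dataset and constants below are illustrative assumptions). The learning rate and step count are fixed by hand; the weight is the only value the training loop itself changes.

```python
# Hyperparameters: chosen by hand before training starts.
learning_rate = 0.1   # hyperparameter
steps = 100           # hyperparameter

# Parameter: learned automatically from data.
w = 0.0  # initialized arbitrarily, updated by gradient descent

# Fit y = 2x by minimizing mean squared error on a toy dataset.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
for _ in range(steps):
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= learning_rate * grad

print(round(w, 4))  # converges toward 2.0
```

Changing `learning_rate` changes how fast `w` converges; only `w` is a parameter in the sense defined above.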
Scale
GPT-3: 175B parameters. GPT-4: estimated ~1.8T parameters (a mixture-of-experts architecture, so only a fraction of parameters is active per token). Llama 3 family: 8B to 405B. The relationship between parameter count and capability is complex: architecture, training data, and training compute all matter.
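Parameter count does set a hard floor on memory, though. A back-of-the-envelope sketch (byte sizes per numeric format are standard; the 70B figure is an example from above):

```python
# Rough memory needed just to store a model's weights at different
# numeric precisions. This ignores activations, KV cache, and
# optimizer state, which add substantially more during training.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}

def weight_memory_gb(n_params: float, dtype: str) -> float:
    """Weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9

for dtype in ("fp32", "fp16", "int8"):
    gb = weight_memory_gb(70e9, dtype)
    print(f"70B params @ {dtype}: {gb:.0f} GB")
```

This is why a 70B model needs on the order of 140 GB of memory at fp16 just to load its weights, before any inference overhead.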