AI Glossary

Mixed-Precision Training

Training neural networks with a mix of 16-bit and 32-bit floating-point numbers to reduce memory usage and increase training speed.

Overview

Mixed-precision training uses a combination of FP16 (half-precision) and FP32 (full-precision) floating-point arithmetic during neural network training. Most operations use FP16 for speed and memory savings, while a master copy of weights is maintained in FP32 for numerical stability during parameter updates.
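The need for an FP32 master copy can be seen with plain numpy: FP16 has only about three decimal digits of precision, so a small weight update that FP32 absorbs fine is rounded away entirely in FP16. This is an illustrative sketch with toy numbers, not framework code.

```python
import numpy as np

lr, grad = 1e-2, 1e-2          # toy learning rate and gradient
update = lr * grad             # update magnitude: 1e-4

# Pure FP16: the spacing between FP16 values near 1.0 is ~9.8e-4,
# so adding 1e-4 rounds straight back to 1.0 and the step is lost.
w16 = np.float16(1.0) + np.float16(update)
print(w16)                     # still 1.0

# FP32 master weights: the same update is retained,
# and the FP16 copy is re-derived from the master each step.
w32 = np.float32(1.0) + np.float32(update)
w16_compute = w32.astype(np.float16)  # used for forward/backward passes
print(w32)                     # 1.0001
```

Keeping the accumulation in FP32 and casting down only for the compute-heavy forward and backward passes is what lets the optimizer make progress even when individual updates are below FP16 resolution.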

Key Details

This technique typically provides 2-3x speedups and halves memory usage with minimal accuracy loss. Loss scaling prevents FP16 underflow by multiplying the loss by a large factor before backpropagation. Mixed precision is standard practice for training large models on modern GPUs (NVIDIA Tensor Cores are optimized for it) and is supported by frameworks like PyTorch AMP and TensorFlow mixed precision.
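Loss scaling can also be demonstrated numerically. FP16 cannot represent values much below about 6e-8, so a sufficiently small gradient underflows to zero; multiplying by a scale factor before the backward pass keeps it representable, and dividing by the same factor in FP32 recovers the true value before the weight update. A minimal sketch with toy numbers (the scale factor 1024 is an arbitrary illustrative choice, not a framework default):

```python
import numpy as np

true_grad = 1e-8                        # below FP16's smallest subnormal (~6e-8)
print(np.float16(true_grad))            # 0.0 -- the gradient underflows

scale = 1024.0
scaled = np.float16(true_grad * scale)  # ~1e-5: representable in FP16
recovered = np.float32(scaled) / scale  # unscale in FP32 before the update
print(recovered)                        # ~1e-8, the gradient survives
```

In practice, frameworks such as PyTorch AMP adjust the scale factor dynamically, lowering it when scaled gradients overflow to infinity and raising it when updates succeed.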

Related Concepts

FP16, Distributed Training, GPU

Last updated: March 5, 2026