AI Glossary

Stable Diffusion

An open-source text-to-image generation model that produces images by iteratively denoising random noise in a compressed latent space rather than in pixel space, making it faster and less memory-hungry than pixel-space diffusion.

Architecture

Stable Diffusion combines three components: a Variational Autoencoder (VAE) that compresses images into a latent space, a U-Net that performs the iterative denoising in that latent space, and a CLIP text encoder whose embeddings condition the generation on the text prompt.
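The three components above can be sketched as a toy pipeline. This is a minimal structural sketch, not the real model: the VAE, U-Net, and CLIP encoder are stubbed out with random tensors (shapes match Stable Diffusion v1's 512x512 configuration), and the denoising update is a toy stand-in for a real scheduler such as DDIM.

```python
import numpy as np

rng = np.random.default_rng(0)

def clip_encode(prompt):
    # Stub for the CLIP text encoder: prompt -> 77 token embeddings of dim 768.
    return rng.standard_normal((77, 768))

def unet(latents, t, text_emb):
    # Stub for the U-Net: predicts the noise present in the latents,
    # conditioned on the timestep t and text embeddings (via cross-attention).
    return rng.standard_normal(latents.shape) * 0.1

def vae_decode(latents):
    # Stub for the VAE decoder: 64x64x4 latents -> 512x512x3 pixel image.
    return rng.standard_normal((512, 512, 3))

def generate(prompt, steps=50):
    text_emb = clip_encode(prompt)
    latents = rng.standard_normal((64, 64, 4))    # start from pure noise
    for t in reversed(range(steps)):
        noise_pred = unet(latents, t, text_emb)
        latents = latents - noise_pred / steps    # toy update; real SD uses a scheduler
    return vae_decode(latents)                    # only now leave latent space

image = generate("a photo of an astronaut riding a horse")
print(image.shape)
```

Note that the entire denoising loop runs on the small 64x64x4 latent tensor; the expensive pixel-space image only appears once, at the final decode step.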

Why Latent Diffusion?

Operating in latent space (downsampled 8x per spatial dimension, so a 512x512x3 image becomes a 64x64x4 latent, roughly 48x fewer values) dramatically reduces compute and memory requirements. This made high-quality image generation practical on consumer GPUs, democratizing AI art creation.
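The savings are easy to quantify. The arithmetic below uses the Stable Diffusion v1 shapes (8x spatial downsampling, 4 latent channels) for a 512x512 RGB image:

```python
# Pixel-space vs latent-space tensor sizes for a 512x512 RGB image
# (Stable Diffusion v1 shapes: 8x spatial downsampling, 4 latent channels).
pixel_elems = 512 * 512 * 3                   # 786,432 values
latent_elems = (512 // 8) * (512 // 8) * 4    # 64 * 64 * 4 = 16,384 values
ratio = pixel_elems / latent_elems
print(latent_elems, ratio)  # 16384 48.0
```

Every denoising step's cost scales with the size of the tensor being denoised, so working on ~48x fewer values is what brings generation within reach of a single consumer GPU.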

Ecosystem

The open release of Stable Diffusion created a massive ecosystem: ComfyUI, Automatic1111, custom models, LoRA fine-tuning, ControlNet for guided generation, and thousands of community-trained checkpoints.


Last updated: March 5, 2026