ControlNet
A neural network that adds spatial conditioning controls (edges, poses, depth) to diffusion models.
Overview
ControlNet, introduced by Zhang et al. in 2023, enables precise spatial control over diffusion model outputs by conditioning generation on additional inputs such as edge maps, depth maps, human pose skeletons, or segmentation maps. It works by creating a trainable copy of the diffusion model's encoder blocks while keeping the original weights frozen; the copy is connected to the base model through zero convolutions, 1x1 convolution layers whose weights and biases are initialized to zero, so the conditioning branch contributes nothing at the start of training and cannot degrade the pretrained model.
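The role of zero convolutions can be sketched numerically. Below is a minimal NumPy sketch (the array shapes and the `zero_conv` helper are illustrative, not taken from the ControlNet codebase) showing that a zero-initialized 1x1 convolution emits an all-zero residual, so adding the ControlNet branch initially leaves the frozen model's output unchanged:

```python
import numpy as np

def zero_conv(x, weight, bias):
    # 1x1 convolution over the channel dimension:
    # x is (C_in, H, W), weight is (C_out, C_in), bias is (C_out,)
    return np.tensordot(weight, x, axes=([1], [0])) + bias[:, None, None]

rng = np.random.default_rng(0)
feature = rng.standard_normal((8, 16, 16))     # feature map from the trainable copy
w = np.zeros((8, 8))                           # zero-initialized 1x1 conv weights
b = np.zeros(8)                                # zero-initialized bias

residual = zero_conv(feature, w, b)            # all zeros at initialization
base_output = rng.standard_normal((8, 16, 16)) # stand-in for a frozen encoder output

# Adding the ControlNet residual is an identity operation at step 0
assert np.allclose(base_output + residual, base_output)
```

During training, gradients move the zero-conv weights away from zero, so the conditioning signal blends in gradually rather than perturbing the pretrained model at initialization.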
Key Details
This allows users to guide image generation with exact spatial layouts while maintaining the quality of the base model. For example, you can generate an image that follows a specific pose or matches the edges of a sketch. ControlNet has become essential for professional creative workflows with Stable Diffusion, enabling much more controlled and predictable generation.