Semantic Segmentation
A computer vision task that assigns a class label to every pixel in an image, creating a detailed understanding of the scene at pixel-level granularity.
How It Works
An encoder (CNN or ViT) extracts features at multiple scales. A decoder upsamples features back to the original image resolution. Each pixel is classified into a category (road, person, sky, building). The output is a segmentation map.
Key Architectures
U-Net: Skip connections between encoder and decoder (medical imaging standard). DeepLab: Atrous convolutions for multi-scale features. Segment Anything (SAM): Meta's foundation model for universal segmentation.
Applications
Autonomous driving (understanding road scenes). Medical imaging (organ and tumor segmentation). Satellite imagery (land use classification). Augmented reality (separating foreground from background). Agricultural monitoring.