AI Glossary

Skip Connection (Residual Connection)

A shortcut that adds a layer's input directly to its output, allowing gradients to flow through deep networks and enabling the training of very deep architectures.

How It Works

Instead of learning a target mapping H(x) directly, the block learns the residual F(x) = H(x) − x. The skip connection then adds the original input x back, so the block outputs F(x) + x. If the optimal mapping is close to the identity, the layer only needs to learn a small residual, which is far easier than learning the identity from scratch.
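The computation above can be sketched in a few lines of NumPy. This is a minimal illustration, not any particular library's implementation; the two-layer MLP used for F and all weight names are hypothetical.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def residual_block(x, W1, W2):
    """Compute y = F(x) + x, where F is a small two-layer MLP (hypothetical weights)."""
    residual = W2 @ relu(W1 @ x)   # F(x): the part the block actually learns
    return residual + x            # skip connection adds the input back

rng = np.random.default_rng(0)
dim = 8
W1 = rng.standard_normal((dim, dim))
W2 = rng.standard_normal((dim, dim))
x = rng.standard_normal(dim)

y = residual_block(x, W1, W2)

# If the weights are zero, F(x) = 0 and the block reduces to the identity:
y_identity = residual_block(x, 0.0 * W1, 0.0 * W2)
print(np.allclose(y_identity, x))  # True
```

Note that the addition requires F(x) and x to have the same shape; when they differ (e.g. a stride or channel change), architectures like ResNet insert a projection on the skip path.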

Why It's Essential

Without skip connections, deep networks suffer from vanishing gradients: gradients shrink as they propagate backward through many layers. Skip connections provide a direct gradient path. Because the block's output is x + F(x), its local Jacobian is I + ∂F/∂x, and the identity term lets the gradient pass through undiminished even when ∂F/∂x is small. This enables networks with hundreds or even thousands of layers.

Ubiquity

Skip connections are used in virtually every modern architecture: ResNet (introduced them), transformers (in every attention and FFN block), U-Net, and DenseNet. They are one of the most important architectural innovations in deep learning.


Last updated: March 5, 2026