Adapter
Small, trainable modules inserted into a pre-trained model that enable task-specific fine-tuning without modifying the original model weights.
How They Work
Adapters are small neural-network layers, typically a bottleneck: a down-projection to a small dimension, a nonlinearity, and an up-projection back to the original dimension, joined to the layer's input through a residual connection. During fine-tuning, only the adapter parameters are trained while the base model's weights stay frozen.
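A minimal sketch of the bottleneck forward pass in NumPy, assuming illustrative sizes (hidden width 768, bottleneck 16) and a ReLU nonlinearity; the up-projection is zero-initialized so the adapter starts out as an identity map, a common initialization choice:

```python
import numpy as np

def adapter_forward(h, W_down, W_up):
    # Bottleneck adapter: down-project, nonlinearity,
    # up-project, then add back the input (residual).
    z = np.maximum(0.0, h @ W_down)  # ReLU for illustration
    return h + z @ W_up

d, r = 768, 16  # hidden size and bottleneck width (illustrative)
rng = np.random.default_rng(0)
W_down = rng.normal(0.0, 0.02, (d, r))  # trainable
W_up = np.zeros((r, d))                 # zero init: adapter starts as identity
h = rng.normal(size=(1, d))

out = adapter_forward(h, W_down, W_up)
```

With `W_up` at zero the residual path dominates and `out` equals `h` exactly, which is why training can start from the pre-trained model's behavior. The trainable parameter count here is 2 × 768 × 16 ≈ 25k per adapter, versus roughly 590k for a single full 768 × 768 layer.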
Benefits
Multiple adapters for different tasks can share the same base model. Switching tasks means swapping adapters (milliseconds), not loading entire models. Storage is minimal since adapters are tiny compared to the base model.
Relation to LoRA
LoRA is a specific adapter-style method that represents the weight update as a low-rank matrix decomposition. Related techniques include prefix tuning, prompt tuning, and (IA)^3, which train small sets of vectors rather than inserted layers. All fall under the umbrella of parameter-efficient fine-tuning (PEFT).
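The low-rank idea can be sketched as follows, with illustrative sizes: the frozen weight W is left untouched, and the update is factored into two small trainable matrices B and A of rank r, so the forward pass computes Wx + BAx. B is zero-initialized so training starts from the pre-trained behavior:

```python
import numpy as np

d, r = 64, 4  # layer width and LoRA rank (illustrative)
rng = np.random.default_rng(1)

W = rng.normal(size=(d, d))        # frozen pre-trained weight
A = rng.normal(0.0, 0.02, (r, d))  # trainable down-projection
B = np.zeros((d, r))               # trainable up-projection, zero init

x = rng.normal(size=(d,))
y = W @ x + B @ (A @ x)  # LoRA forward: frozen path plus low-rank update

trainable = A.size + B.size  # 2 * d * r = 512
frozen = W.size              # d * d   = 4096
```

Because B starts at zero, `y` initially equals `W @ x`; only the 2dr low-rank parameters are updated during fine-tuning, an 8x reduction over the dd frozen weight even at this small scale.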