LoRA Adapters
Lightweight trainable modules that adapt pre-trained models to new tasks using low-rank matrix decomposition.
Overview
LoRA (Low-Rank Adaptation) adapters are small, trainable modules inserted into a frozen pre-trained model. Instead of fine-tuning all model weights, LoRA represents each weight update as the product of two low-rank matrices, B and A, added alongside the frozen weight matrix: W' = W + BA. Because the rank r is much smaller than the original dimensions, this dramatically reduces the number of trainable parameters.
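A minimal NumPy sketch of the idea (illustrative only; the dimensions, rank, and scaling factor alpha are chosen for the example, not taken from any particular implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, r, alpha = 64, 64, 4, 8  # assumed example sizes; r << d_in, d_out

W = rng.standard_normal((d_out, d_in))     # frozen pre-trained weight (not trained)
A = rng.standard_normal((r, d_in)) * 0.01  # trainable low-rank factor
B = np.zeros((d_out, r))                   # zero-initialized so the adapter starts as a no-op

def lora_forward(x):
    # y = W x + (alpha / r) * B A x  -- the adapter adds a rank-r update to the frozen layer
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B = 0 the adapter contributes nothing, so the output matches the base model
assert np.allclose(lora_forward(x), W @ x)

# Trainable-parameter comparison: full fine-tuning updates all of W,
# LoRA only updates A and B
full_params = W.size            # d_out * d_in = 4096
lora_params = A.size + B.size   # r * (d_in + d_out) = 512
```

Only A and B receive gradients during training; the savings grow with matrix size, since LoRA's parameter count scales linearly in the dimensions rather than quadratically.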
Key Details
LoRA adapters typically represent 0.1-1% of the original model's parameters but can match full fine-tuning performance on many tasks. Multiple adapters can be trained for different tasks and hot-swapped at inference time. Variants include QLoRA (combining quantization with LoRA for even lower memory), DoRA (decomposed weight-norm adaptation), and rsLoRA (rank-stabilized LoRA). LoRA has become the standard approach for efficiently customizing large language models.
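Hot-swapping follows directly from the additive form: each task keeps its own (B, A) pair against the same frozen base weights, and an adapter can be folded into the weights for zero-overhead inference. A sketch under the same assumed shapes as above (the task names are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
d, r, alpha = 32, 4, 8  # assumed example sizes

W = rng.standard_normal((d, d))  # shared frozen base weight

# Two hypothetical task adapters trained separately against the same base model
adapters = {
    "summarize": (rng.standard_normal((d, r)), rng.standard_normal((r, d))),
    "translate": (rng.standard_normal((d, r)), rng.standard_normal((r, d))),
}

def merged_weight(task):
    # Fold the chosen adapter into the base weight: W' = W + (alpha / r) * B A.
    # After merging, inference costs the same as the unadapted model.
    B, A = adapters[task]
    return W + (alpha / r) * (B @ A)

x = rng.standard_normal(d)
y_sum = merged_weight("summarize") @ x
y_tr = merged_weight("translate") @ x
# Swapping tasks is just selecting a different (B, A) pair; W itself never changes
assert not np.allclose(y_sum, y_tr)
```

Storing one adapter per task costs only the small (B, A) pairs, which is why many adapters can be kept and swapped cheaply where storing a full fine-tuned copy of the model per task would not be.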
Related Concepts
lora • fine tuning • peft