
Low-Rank Adaptation (LoRA)

Low-Rank Adaptation (LoRA) is a Parameter-Efficient Fine-Tuning (PEFT) technique that freezes the pre-trained model weights and injects trainable low-rank decomposition matrices into selected layers. Instead of updating the full weight matrix $W \in \mathbb{R}^{m \times d}$, LoRA learns an update $\Delta W = B A$, where $B \in \mathbb{R}^{m \times r}$ and $A \in \mathbb{R}^{r \times d}$ with rank $r \ll \min(m, d)$. Only $r(m + d)$ parameters are trained per adapted matrix instead of $m d$, which sharply reduces the number of trainable parameters and the optimizer memory footprint. Because the original weights stay untouched, LoRA can reduce the risk of catastrophic forgetting, and it often reaches performance close to full fine-tuning on many tasks.
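To make the parameter savings concrete, the short sketch below counts trainable parameters for a single weight matrix. The sizes (a 4096 by 4096 matrix with rank 8) and the function names are illustrative assumptions, not values taken from this page.

```python
def full_update_params(m: int, d: int) -> int:
    """Parameters trained when the whole m x d weight matrix is updated."""
    return m * d


def lora_update_params(m: int, d: int, r: int) -> int:
    """Parameters trained by LoRA: one m x r factor plus one r x d factor."""
    return r * (m + d)


# Hypothetical layer size and rank, chosen only for illustration.
m, d, r = 4096, 4096, 8

full = full_update_params(m, d)      # 4096 * 4096 = 16,777,216
lora = lora_update_params(m, d, r)   # 8 * (4096 + 4096) = 65,536
reduction = full / lora              # 256.0, i.e. about 0.39% of the full count

print(f"full: {full:,}  lora: {lora:,}  reduction: {reduction:.0f}x")
```

The saving grows as $r$ shrinks relative to $m$ and $d$, which is why the rank is usually kept small.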

Mechanism

Replaces the full weight update with a low-rank factorization, so only the small matrices $A$ and $B$ are trained and stored rather than the full weight delta $\Delta W$.
Integrates with existing Transformer layers by adding the low-rank update to the frozen pre-trained weights: $W_{\text{new}} = W_{\text{frozen}} + \Delta W$. For inference, the update can be merged into the weights ahead of time, so the adapted layer adds no extra latency.
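The mechanism can be sketched in a few lines of dependency-free Python. This is a minimal illustration, not a training implementation: the factor of shape $m \times r$ is applied after the factor of shape $r \times d$ so that their product has the same shape as $W$, and the helper names (`matmul`, `lora_forward`, `merge`) and the tiny matrix sizes are assumptions made for the example. It checks that applying the adapter on the side gives the same output as merging the update into the frozen weights.

```python
import random


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def matadd(X, Y):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
m, d, r = 6, 4, 2  # output dim, input dim, rank (illustrative sizes)

W_frozen = random_matrix(m, d, rng)  # pre-trained weights, never updated
A = random_matrix(r, d, rng)         # trainable factor, r x d
# Random here so the update is nonzero in this demo; in practice B is
# commonly initialised to zero so that training starts from W_frozen.
B = random_matrix(m, r, rng)         # trainable factor, m x r


def lora_forward(x, W_frozen, A, B):
    """Compute W_frozen x + B (A x) without ever forming the m x d delta."""
    return matadd(matmul(W_frozen, x), matmul(B, matmul(A, x)))


def merge(W_frozen, A, B):
    """Fold the low-rank update into a single m x d matrix for inference."""
    return matadd(W_frozen, matmul(B, A))


x = random_matrix(d, 1, rng)  # one input column vector
y_adapter = lora_forward(x, W_frozen, A, B)
y_merged = matmul(merge(W_frozen, A, B), x)

# The two paths agree up to floating-point rounding.
max_diff = max(abs(a[0] - b[0]) for a, b in zip(y_adapter, y_merged))
print(f"max difference between adapter and merged outputs: {max_diff:.2e}")
```

Keeping the adapter separate makes it easy to swap task-specific $A$ and $B$ pairs over one shared set of frozen weights, while merging yields a single matrix of the original size.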

Context & Resources

See Low-Rank Adaptation (LoRA) for Efficient AI Model Fine-Tuning for a detailed overview of Parameter-Efficient Adaptation (PEA) techniques and the computational benefits of LoRA.
Low-Rank Adaptation (LoRA) for Efficient AI Model Fine-Tuning (Video by Jia-Bin Huang)
