Low-Rank Adaptation
LoRA
Low-Rank Adaptation, a parameter-efficient fine-tuning method that updates a small set of low-rank matrices instead of the full model.
LoRA, short for Low-Rank Adaptation, is a technique for adapting large models without retraining all their parameters. Instead of modifying the full weight matrices, LoRA freezes them and adds a pair of small trainable low-rank matrices alongside each targeted layer; their product forms the weight update.
This dramatically reduces memory use and training cost while still achieving strong task-specific performance. It made fine-tuning large open models far more accessible to smaller teams.
Main benefit: you keep the original model mostly frozen and train only a lightweight adapter, which is cheaper, faster, and easier to share.
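The adapter idea above can be sketched in a few lines. This is a minimal illustration, not any library's API: the names `W`, `A`, `B`, `alpha`, and `lora_forward` are all hypothetical, and the dimensions are toy-sized.

```python
import numpy as np

# Minimal LoRA sketch (illustrative names, not a real library API).
# A frozen weight W (d_out x d_in) is augmented with trainable low-rank
# factors B (d_out x r) and A (r x d_in), with rank r << min(d_out, d_in).
rng = np.random.default_rng(0)
d_out, d_in, r = 8, 8, 2
alpha = 4  # scaling hyperparameter; effective update is (alpha / r) * B @ A

W = rng.normal(size=(d_out, d_in))     # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01  # trainable, small random init
B = np.zeros((d_out, r))               # trainable, zero init: no change at start

def lora_forward(x):
    # Base path plus low-rank update; only A and B would receive gradients.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
# With B initialized to zero, the adapted model matches the base model exactly.
assert np.allclose(lora_forward(x), W @ x)
```

Note the parameter savings: the adapter trains `r * (d_in + d_out)` values instead of the full `d_out * d_in`, which is where the memory and cost reduction comes from.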
Why LoRA Became Popular
- Lower GPU requirements — enables practical fine-tuning on modest hardware
- Fast experimentation — train small adapters for many use cases
- Easy switching — load different adapters onto the same base model
- Strong performance — often close to full fine-tuning for specific tasks
LoRA is now a standard method in open-source model workflows. It is especially helpful when customizing pre-trained models for domain-specific applications.