Low-Rank Adaptation
LoRA
Low-Rank Adaptation, a parameter-efficient fine-tuning method that updates a small set of low-rank matrices instead of the full model.
LoRA, short for Low-Rank Adaptation, is a technique for adapting large models without retraining all their parameters. Instead of modifying the full weight matrices, LoRA freezes them and adds a pair of small trainable low-rank matrices alongside each targeted layer; their product forms the weight update.
This dramatically reduces memory use and training cost while still achieving strong task-specific performance. It made fine-tuning large open models far more accessible to smaller teams.
Main benefit: you keep the original model mostly frozen and train only a lightweight adapter, which is cheaper, faster, and easier to share.
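The adapter idea above can be sketched in a few lines. This is a minimal illustration, not any library's API: the names `W`, `A`, `B`, `alpha`, and `lora_forward` are all hypothetical, and the dimensions are toy-sized.

```python
import numpy as np

# Minimal LoRA sketch (illustrative names, not a real library API).
# A frozen weight W (d_out x d_in) is augmented with trainable low-rank
# factors B (d_out x r) and A (r x d_in), with rank r << min(d_out, d_in).
rng = np.random.default_rng(0)
d_out, d_in, r = 8, 8, 2
alpha = 4  # scaling hyperparameter; effective update is (alpha / r) * B @ A

W = rng.normal(size=(d_out, d_in))     # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01  # trainable, small random init
B = np.zeros((d_out, r))               # trainable, zero init: no change at start

def lora_forward(x):
    # Base path plus low-rank update; only A and B would receive gradients.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
# With B initialized to zero, the adapted model matches the base model exactly.
assert np.allclose(lora_forward(x), W @ x)
```

Note the parameter savings: the adapter trains `r * (d_in + d_out)` values instead of the full `d_out * d_in`, which is where the memory and cost reduction comes from.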
Why LoRA Became Popular
- Lower GPU requirements — enables practical fine-tuning on modest hardware
- Fast experimentation — train small adapters for many use cases
- Easy switching — load different adapters onto the same base model
- Strong performance — often close to full fine-tuning for specific tasks
LoRA is now a standard method in open-source model workflows. It is especially helpful when customizing pre-trained models for domain-specific applications.