DigiNews

Tech Watch by Johan Denoyer


LoRA and Weight Decay

Quality: 8/10 Relevance: 9/10

Summary

LoRA does not exactly emulate full finetuning because of how weight decay interacts with the adapter setup: decaying the adapter matrices shrinks the low-rank update, which pulls the effective weights toward the original frozen weights rather than toward zero. The post analyzes why LoRA therefore solves a different optimization problem, derives the gradient dynamics, and proposes a concrete modification to weight decay that aligns LoRA more closely with full finetuning when that is desired. It also discusses practical considerations for momentum-based optimizers and provides code-oriented guidance.
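The bias described in the summary can be seen in a one-weight toy example. The sketch below is not taken from the post: it assumes decoupled (AdamW-style) weight decay, sets the loss gradient to zero so that only the decay acts, and uses hypothetical scalar stand-ins `w0`, `a`, and `b` for the frozen weight and the two adapter factors.

```python
def decay_only(w0, a, b, lr=0.1, wd=0.1, steps=200):
    """Apply decoupled weight decay alone (loss gradient assumed zero).

    w0   : pretrained weight (frozen in the LoRA case)
    a, b : scalar stand-ins for the LoRA factors, so the update is b * a
    Returns (fully finetuned weight, effective LoRA weight).
    """
    shrink = 1.0 - lr * wd  # per-step multiplicative decay factor
    w_full = w0             # full finetuning: the weight itself is decayed
    for _ in range(steps):
        w_full *= shrink    # pulled toward zero
        a *= shrink         # LoRA: only the adapter factors are decayed
        b *= shrink
    w_lora = w0 + b * a     # frozen weight plus the shrunken update
    return w_full, w_lora


w0, a, b = 2.0, 1.5, -0.8
w_full, w_lora = decay_only(w0, a, b)
print("full finetuning weight:", w_full)          # drifts toward 0
print("LoRA effective weight: ", w_lora)          # drifts toward w0
print("LoRA distance from w0: ", abs(w_lora - w0))
```

With no gradient signal, the fully finetuned weight decays toward zero, while the LoRA effective weight returns to `w0` because only the product `b * a` shrinks. Any modification that aims to match full finetuning has to account for this difference in the decay target.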
