Fine-Tuning LLMs
Fine-tuning adapts a pre-trained language model to your specific domain, style, or task. Learn when it makes sense, how to prepare your data, and how to use efficient techniques like LoRA and QLoRA to train models on consumer hardware.
What it is
What Is Fine-Tuning?
Pre-trained LLMs like Llama 3, Mistral, and Gemma are trained on vast general-purpose datasets. Fine-tuning is a second training stage on a smaller, task-specific dataset that adjusts the model's weights to excel at your particular use case — whether that's writing in your brand voice, following a specific output format, or mastering a technical domain.
Modern fine-tuning is far more accessible than it sounds. Parameter-efficient methods like LoRA mean you can fine-tune a high-quality 7B model on a single consumer GPU in a few hours.
Instruction Tuning
Train the model to follow instructions using (instruction, response) pairs. Creates chat-capable, assistant-style models.
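To make the pair format concrete, here is a minimal sketch of one (instruction, response) training example and how it might be rendered into a single training string. The Alpaca-style template and the example text are illustrative, not tied to any particular model or dataset.

```python
import json

# One illustrative (instruction, response) training pair.
pair = {
    "instruction": "Summarise the following text in one sentence.",
    "input": "LoRA trains small adapter matrices instead of all model weights.",
    "response": "LoRA fine-tunes a model by training small low-rank adapters "
                "rather than every weight.",
}

def to_prompt(pair: dict) -> str:
    """Render a pair into a single training string (Alpaca-style template)."""
    return (
        f"### Instruction:\n{pair['instruction']}\n\n"
        f"### Input:\n{pair['input']}\n\n"
        f"### Response:\n{pair['response']}"
    )

# Datasets are commonly stored as JSON Lines, one pair per line.
jsonl_line = json.dumps(pair)
print(to_prompt(pair))
```

In practice you would collect hundreds to thousands of such lines in a `.jsonl` file and feed the rendered strings to your trainer.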
LoRA
Low-Rank Adaptation freezes the base model and trains small adapter matrices instead, cutting trainable parameters by orders of magnitude with minimal quality loss.
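The parameter savings are easy to see in plain NumPy. Instead of updating a d×d weight matrix, LoRA learns a thin down-projection A (r×d) and up-projection B (d×r) and applies W + (α/r)·BA. A toy sketch, with the hidden size, rank, and scaling chosen for illustration:

```python
import numpy as np

d, r, alpha = 4096, 8, 16           # hidden size, LoRA rank, scaling factor
rng = np.random.default_rng(0)

W = rng.normal(size=(d, d))         # frozen pre-trained weight
A = rng.normal(size=(r, d)) * 0.01  # trainable down-projection
B = np.zeros((d, r))                # trainable up-projection (zero-init,
                                    # so training starts from the base model)

# Effective weight during/after fine-tuning: W + (alpha / r) * B @ A
W_eff = W + (alpha / r) * (B @ A)

full_params = W.size                # ~16.8M values if we tuned W directly
lora_params = A.size + B.size       # only 65,536 trainable values with LoRA
print(f"trainable: {lora_params:,} vs {full_params:,} "
      f"({full_params // lora_params}x fewer)")
```

Because B starts at zero, the adapted model is initially identical to the base model; only the small A and B matrices receive gradients during training.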
QLoRA
Quantized LoRA combines 4-bit quantization with LoRA adapters. Fine-tune 7B models on a single 16 GB GPU.
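The quantization half of QLoRA can be sketched with simple block-wise absmax quantization. Real QLoRA uses the NF4 data type with double quantization, so this signed-int4 version is a simplification, but it shows why storing frozen weights in 4 bits shrinks memory roughly 4× versus fp16:

```python
import numpy as np

def quantize_absmax_int4(w: np.ndarray):
    """Simplified block-wise absmax quantization to signed 4-bit levels [-7, 7]."""
    scale = np.abs(w).max() / 7.0              # one scale per block
    q = np.clip(np.round(w / scale), -7, 7).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
block = rng.normal(scale=0.02, size=64).astype(np.float32)  # one weight block

q, scale = quantize_absmax_int4(block)
w_hat = dequantize(q, scale)
err = np.abs(block - w_hat).max()              # bounded by half a quantization step
print(f"max reconstruction error: {err:.5f}")
```

In QLoRA the base weights stay frozen in this compressed form; gradients flow only through the small LoRA adapters, which remain in higher precision.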
RLHF
Reinforcement Learning from Human Feedback aligns model behaviour with human preferences. Used in ChatGPT, Claude, Gemini.
DPO
Direct Preference Optimisation is a simpler RLHF alternative. Trains on preference pairs without a reward model.
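DPO's per-pair loss is compact enough to compute by hand: it pushes the policy to increase the log-probability margin of the chosen response over the rejected one, relative to a frozen reference model. A sketch with illustrative log-probabilities (the formula follows the DPO paper; the numbers are made up):

```python
import math

def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair:
    -log sigmoid(beta * ((pi_w - ref_w) - (pi_l - ref_l)))."""
    margin = ((logp_chosen - ref_logp_chosen)
              - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Illustrative sequence log-probs under the policy and the frozen reference.
loss = dpo_loss(logp_chosen=-12.0, logp_rejected=-15.0,
                ref_logp_chosen=-13.0, ref_logp_rejected=-14.0)
print(f"{loss:.4f}")
```

Note there is no reward model anywhere: the preference pairs and the reference model are all the training loop needs, which is what makes DPO simpler to run than full RLHF.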
Domain Adaptation
Continue pre-training on domain-specific text (medical, legal, code) so the model learns specialised vocabulary and facts.
Why it matters
When Fine-Tuning Is the Right Choice
Fine-tuning is not always the answer — but when it is, it delivers capabilities that prompting simply cannot match:
- Style and tone consistency — bake your brand voice or writing style directly into the model
- Format adherence — models learn to reliably output specific JSON schemas, code formats, or document structures
- Reduced prompt length — move few-shot examples into weights, cutting token costs at scale
- Domain expertise — models learn specialised terminology, reasoning patterns, and domain knowledge
- Privacy — run a fine-tuned open-source model entirely on your own infrastructure
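The "reduced prompt length" point is worth a back-of-envelope check. All numbers below are hypothetical placeholders, substitute your own prompt sizes and your provider's actual rates:

```python
# Hypothetical numbers -- not any vendor's real pricing.
few_shot_tokens = 1200              # few-shot examples repeated in every request
requests_per_month = 1_000_000
price_per_1k_input_tokens = 0.0005  # illustrative rate in dollars

tokens_saved = few_shot_tokens * requests_per_month
monthly_saving = tokens_saved / 1000 * price_per_1k_input_tokens
print(f"{tokens_saved:,} tokens, ${monthly_saving:,.2f}/month saved")
```

Once few-shot examples are baked into the weights, this per-request overhead disappears entirely, which is why fine-tuning tends to pay off at high request volumes even after accounting for training cost.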
Fine-tuning vs. prompting: when to use each
| Scenario | Recommendation |
|---|---|
| Prototype or early product | Prompt engineering first |
| Need consistent output format | Fine-tuning (+ structured output) |
| Knowledge changes frequently | RAG, not fine-tuning |
| Need specific persona or style | Fine-tuning |
| Fewer than 100 examples | Few-shot prompting |
| 500+ high-quality examples | Fine-tuning likely worth it |
| Cost-sensitive at high volume | Fine-tune a smaller model |
Where it fits in the AI roadmap
Phase 6 of the AI Engineering Roadmap
Fine-Tuning LLMs is Phase 6 of the AI roadmap for developers. It comes after you have mastered the core LLM development stack.
Fine-tuning is an advanced topic — you will get the most out of it once you understand model behaviour through prompting, have a clear task definition, and have a quality dataset. Do not skip phases 3–5.
Tutorials on this site
Fine-Tuning Guides & Deep-Dives
From understanding transformer internals to running LoRA fine-tuning on open-source models — practical guides for every stage of the process.
Related topic hubs
Continue Learning
Ready to go deeper into LLMs?
Follow the complete AI engineering roadmap — from API basics to fine-tuning your own models.
View the full AI roadmap →