Fine-Tuning LLMs

Fine-tuning adapts a pre-trained language model to your specific domain, style, or task. Learn when it makes sense, how to prepare your data, and how to use efficient techniques like LoRA and QLoRA to train models on consumer hardware.


What Is Fine-Tuning?

Pre-trained LLMs like Llama 3, Mistral, and Gemma are trained on vast general-purpose datasets. Fine-tuning is a second training stage on a smaller, task-specific dataset that adjusts the model's weights to excel at your particular use case — whether that's writing in your brand voice, following a specific output format, or mastering a technical domain.

Modern fine-tuning is far more accessible than it sounds. Parameter-efficient methods like LoRA mean you can fine-tune a high-quality 7B model on a single consumer GPU in a few hours.

Instruction Tuning

Train the model to follow instructions using (instruction, response) pairs. Creates chat-capable, assistant-style models.
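For instance, each (instruction, response) pair is usually rendered into a single training string before tokenisation. The template below is a hypothetical illustration; in practice you would use the chat template that ships with the model you are tuning.

```python
def format_example(instruction: str, response: str) -> str:
    """Render one (instruction, response) pair as a single training
    string. This Alpaca-style template is illustrative only; real
    fine-tuning should use the target model's own chat template."""
    return (
        "### Instruction:\n"
        f"{instruction}\n\n"
        "### Response:\n"
        f"{response}"
    )

example = format_example(
    "Summarise the ticket in one sentence.",
    "Customer cannot reset their password on mobile.",
)
print(example)
```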

LoRA

Low-Rank Adaptation freezes the base model's weights and trains small low-rank adapter matrices instead. Reduces GPU memory 10×+ with minimal quality loss.
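The arithmetic behind that saving is simple: instead of updating a full d_out × d_in weight matrix, LoRA trains two rank-r factors B (d_out × r) and A (r × d_in). A minimal sketch of the parameter-count comparison (function name is ours, for illustration):

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple[int, int]:
    """Trainable parameters for one weight matrix: full fine-tuning
    updates every entry of the d_out x d_in matrix, while LoRA trains
    only the low-rank factors B (d_out x r) and A (r x d_in)."""
    full = d_out * d_in
    lora = d_out * rank + rank * d_in
    return full, lora

# A 4096x4096 projection (typical layer size in a 7B model) at rank 8:
full, lora = lora_param_counts(4096, 4096, 8)
print(full, lora, full / lora)  # 16777216 65536 256.0
```

At rank 8, a single 4096×4096 layer needs 256× fewer trainable parameters, which is where the optimiser-state and gradient memory savings come from.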

QLoRA

Quantized LoRA combines 4-bit quantization with LoRA adapters. Fine-tune 7B models on a single 16 GB GPU.
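To see what "4-bit" buys you, here is a toy absmax quantiser: each block of floats is stored as 16-level integers plus one float scale. QLoRA's actual NF4 scheme uses a non-uniform codebook matched to normally distributed weights; this uniform version only illustrates the memory trade-off.

```python
def quantize_4bit(values, levels=16):
    """Toy absmax 4-bit quantization: map floats onto 16 integer
    levels, keeping one float scale per block. Illustrative only;
    QLoRA's NF4 codebook is non-uniform."""
    scale = max(abs(v) for v in values) or 1.0
    q = [round(v / scale * (levels // 2 - 1)) for v in values]
    return q, scale

def dequantize(q, scale, levels=16):
    """Recover approximate floats from the 4-bit codes and scale."""
    return [v / (levels // 2 - 1) * scale for v in q]

q, s = quantize_4bit([0.5, -1.0, 0.25, 0.0])
print(dequantize(q, s))  # close to the inputs, with small rounding error
```

The base model's weights sit frozen in this compressed form while the LoRA adapters train in higher precision, which is how a 7B model fits in 16 GB.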

RLHF

Reinforcement Learning from Human Feedback aligns model behaviour with human preferences. Used in ChatGPT, Claude, Gemini.

DPO

Direct Preference Optimisation is a simpler RLHF alternative. Trains on preference pairs without a reward model.
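The DPO objective itself fits in a few lines. For each preference pair it penalises the policy when the rejected response is favoured over the chosen one, relative to a frozen reference model. A minimal sketch (function name and inputs are ours, for illustration):

```python
import math

def dpo_loss(chosen_logratio: float, rejected_logratio: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair. Each argument is
    log pi_theta(y|x) - log pi_ref(y|x) for the chosen / rejected
    response; beta controls how far the policy may drift from the
    reference model. Loss = -log sigmoid(beta * (chosen - rejected))."""
    margin = beta * (chosen_logratio - rejected_logratio)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Loss falls as the policy prefers the chosen response more strongly:
print(dpo_loss(2.0, -2.0))   # preference correct -> lower loss
print(dpo_loss(-2.0, 2.0))   # preference inverted -> higher loss
```

Because the "reward" is implicit in these log-probability ratios, no separate reward model or RL loop is needed, which is the practical appeal over full RLHF.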

Domain Adaptation

Continue pre-training on domain-specific text (medical, legal, code) so the model learns specialised vocabulary and facts.


When Fine-Tuning Is the Right Choice

Fine-tuning is not always the answer — but when it is, it delivers capabilities that prompting simply cannot match:

Scenario → Recommendation
Prototype or early product → Prompt engineering first
Need consistent output format → Fine-tuning (+ structured output)
Knowledge changes frequently → RAG, not fine-tuning
Need specific persona or style → Fine-tuning
Fewer than 100 examples → Few-shot prompting
500+ high-quality examples → Fine-tuning likely worth it
Cost-sensitive at high volume → Fine-tune a smaller model

Phase 6 of the AI Engineering Roadmap

Fine-Tuning LLMs is Phase 6 of the AI roadmap for developers. It comes after you have mastered the core LLM development stack:

Phase 3 Prompt Engineering → understand what models respond to
Phase 4–5 RAG + Agents → build production AI applications
Phase 6 ★ Fine-Tuning LLMs ← you are here
Phase 7 Real-world Projects → ship production AI systems

Fine-tuning is an advanced topic — you will get the most out of it once you understand model behaviour through prompting, have a clear task definition, and have a quality dataset. Do not skip phases 3–5.


Fine-Tuning Guides & Deep-Dives

From understanding transformer internals to running LoRA fine-tuning on open-source models — practical guides for every stage of the process.


Continue Learning

Ready to go deeper into LLMs?

Follow the complete AI engineering roadmap — from API basics to fine-tuning your own models.

View the full AI roadmap →