← Dictionary
AInoun

Fine-Tuning

/faɪn ˈtjuːnɪŋ/

Adapting an existing AI model by training it further on specific data.

Definition

Fine-tuning is the process of taking a pre-trained AI model and continuing to train it on a smaller, domain-specific dataset — adapting its behaviour, style, or knowledge to a particular use case without training a model from scratch.

Fine-tuning was the default customisation approach in the GPT-3 era. With the rise of much larger models and retrieval-augmented architectures (RAG), most production teams now reach for prompting and RAG before fine-tuning — they're cheaper, faster to iterate, and produce results that are easier to inspect and update.

Fine-tuning still has legitimate uses: enforcing very specific output formats, adapting model style to a brand voice, training on proprietary task data where prompt engineering hits a ceiling. But it's no longer the first move; it's a specialised tool.

Origin

Fine-tuning as a transfer-learning technique predates LLMs — it's been standard practice in deep learning since ~2014. The OpenAI fine-tuning API (2021) brought it to mainstream LLM use; the technique remains essential in computer vision and speech.

How it works

  1. Determine whether fine-tuning is actually needed (try prompting and RAG first).
  2. Build a high-quality dataset (typically 50-1,000 examples for instruction fine-tuning).
  3. Choose the base model (GPT-4o-mini, Llama, Claude, etc. — vendor support varies).
  4. Run the fine-tuning job (most major vendors offer managed fine-tuning).
  5. Evaluate the fine-tuned model against the base model on a held-out test set.
  6. Deploy with monitoring; budget for ongoing retraining as data drifts.

When to use it

Use when

  • When prompting and RAG can't achieve the required output quality or style.
  • For domain-specific task formats with proprietary data.
  • When latency or cost demands a smaller fine-tuned model.

Skip when

  • For general-purpose tasks — the base models are very strong.
  • When the dataset is small (under 50 examples).
  • Before exhausting prompt engineering and RAG.

Key metrics

Examples

In practice at Makreate

Makreate AI engagements treat fine-tuning as a specialised tool, not a default. We typically exhaust prompting and RAG before recommending fine-tuning, because the iteration speed of prompting is materially faster and the quality gap has narrowed substantially with modern models. When fine-tuning is genuinely the right tool — for very specific output formats or significant cost optimisation — we build the eval framework first and the fine-tune second.

AI Web App Development →

Common mistakes

Frequently asked

How many examples do I need to fine-tune?

Highly variable. Instruction fine-tuning often works with 50-500 examples. Classification tasks may need 1,000+. Style adaptation often needs surprisingly few (50-200).

Fine-tune or use RAG?

RAG for current/proprietary knowledge; fine-tuning for output style or format. They can also be combined.

Should I fine-tune the latest model or a smaller one?

For cost-sensitive workloads, fine-tune the smallest model that meets quality bars. For quality-sensitive workloads, prompt the largest available model first.

Related terms

WhatsApp