What’s the best strategy for fine-tuning a large language model (LLM) on domain-specific data without catastrophic forgetting?

Suhebmultani · October 10, 2025, 11:10am

How can we fine-tune a large model for a specific domain while still keeping its general knowledge and skills?

John6666 · October 11, 2025, 6:50am

Common methods include mixing general-purpose data back into the fine-tuning set, so the model keeps rehearsing what it already knows, and freezing parts of the network so they are not updated during fine-tuning. The two are often used in combination.

AshwiShan · June 3, 2026, 8:23pm

Good question. The answer depends on which version of “keep general knowledge” you actually need, and it’s worth separating the two:

You need the facts available but not reasoned over → don’t fine-tune, use RAG. Retrieval leaves the base model untouched, so there’s zero forgetting by definition. Best when your domain is mostly lookup.
You need the model to actually internalize the domain (reason and generate in it) → you have to touch the weights, and that’s where forgetting bites. The standard tools (as John6666 said) are mixing general data back in during training and freezing layers. They work, but they’re fiddly: you have to keep a general-data replay set around and tune the mixing ratio, and every layer you freeze reduces how much the model can learn about the new domain.
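
To make the two standard tools concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the helper names `mix_with_replay` and `trainable_parameter_names` are hypothetical, the `replay_ratio` value is something you would have to tune, and the `.layers.<index>.` naming pattern is assumed to match Llama-style checkpoints (other architectures name their blocks differently).

```python
import random
import re


def mix_with_replay(domain_examples, general_examples, replay_ratio, seed=0):
    """Return a shuffled training list in which roughly `replay_ratio`
    of the examples come from the general (replay) set."""
    if not 0.0 <= replay_ratio < 1.0:
        raise ValueError("replay_ratio must be in [0, 1)")
    rng = random.Random(seed)
    # How many replay examples are needed so they make up
    # `replay_ratio` of the final mix.
    n_replay = round(len(domain_examples) * replay_ratio / (1.0 - replay_ratio))
    if not general_examples:
        n_replay = 0
    # Sample with replacement so a small replay set can still fill the quota.
    replay = [rng.choice(general_examples) for _ in range(n_replay)]
    mixed = list(domain_examples) + replay
    rng.shuffle(mixed)
    return mixed


# Assumed naming convention: transformer blocks appear as ".layers.<index>."
LAYER_RE = re.compile(r"\.layers\.(\d+)\.")


def trainable_parameter_names(param_names, freeze_below, freeze_embeddings=True):
    """Keep blocks with index >= freeze_below trainable; freeze the lower
    blocks and (optionally) anything with 'embed' in its name."""
    trainable = []
    for name in param_names:
        match = LAYER_RE.search(name)
        if match:
            if int(match.group(1)) >= freeze_below:
                trainable.append(name)
        elif not (freeze_embeddings and "embed" in name):
            trainable.append(name)
    return trainable
```

With a PyTorch model you would typically loop over `model.named_parameters()` and set `param.requires_grad = False` for every name not returned by the second helper. The trade-off described above shows up directly in the two knobs: a higher `replay_ratio` spends more of each epoch on general data, and a higher `freeze_below` leaves fewer weights free to absorb the domain.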

A lower-friction route in that second case is a constrained adapter: instead of replaying or penalizing to recover general knowledge after the fact, you constrain the update so it can’t overwrite base capabilities in the first place. No replay set, no penalty coefficient to tune. In our continual-learning tests this held ~0% drift on prior knowledge while still learning the new domain, whereas plain sequential LoRA degraded by several hundred percent.
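
The post does not say how drift was measured, so the following is only one plausible reading: drift as the relative change in a "lower is better" metric (for example, evaluation loss on a held-out general-knowledge set) between the base model and the fine-tuned model. The function names `percent_drift` and `retention_report` are hypothetical.

```python
def percent_drift(before, after):
    """Relative change, in percent, of a 'lower is better' metric
    (e.g. eval loss) measured before and after fine-tuning.
    Positive values mean the model got worse on the prior task."""
    if before <= 0:
        raise ValueError("baseline metric must be positive")
    return 100.0 * (after - before) / before


def retention_report(before_scores, after_scores):
    """Per-task drift for every task present in the baseline scores.
    Both arguments map a task name to its metric value."""
    return {
        task: percent_drift(before_scores[task], after_scores[task])
        for task in before_scores
    }
```

Under this definition, a loss that stays at its baseline value gives 0% drift, and a loss that rises from 2.0 to 8.0 (made-up numbers, not results from the post) gives 300% drift. Running the same report on your own general-knowledge evaluation set before and after fine-tuning is one simple way to compare retention across methods.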

If you share your model + domain I’m happy to point you at the exact setup, or run a quick before/after so you can see the retention numbers on your own data.

-- Ashwin (working on this at ModelBrew, modelbrew.ai)
