AI teaches itself how to stop forgetting.

Prasanth Aby Thomas

Large Language Models (LLMs) often lose previously learned skills when adapted for new tasks. A novel self-distillation methodology aims to mitigate this skill regression and streamline model maintenance.

A novel fine-tuning strategy aims to conquer “catastrophic forgetting,” a common obstacle hindering repeated model updates in business environments.

Researchers from MIT, the Improbable AI Lab, and ETH Zurich have unveiled a new fine-tuning method that allows models to absorb new tasks without sacrificing their previously acquired abilities.

To avoid degrading existing functionalities, many organizations currently keep new tasks separate, employing distinct fine-tuned models or adapters. This fragmentation inflates costs and complicates governance, demanding continuous retesting to guard against performance drops.

The new technique, dubbed self-distillation fine-tuning (SDFT), is designed to resolve this dilemma.

The researchers explain that SDFT “leverages in-context learning by using a demonstration-conditioned model as its own teacher, generating on-policy training signals that preserve prior capabilities while acquiring new skills.”

They further noted that it consistently outperforms supervised fine-tuning (SFT) “across skill learning and knowledge acquisition tasks,” delivering enhanced accuracy on new tasks “while substantially reducing catastrophic forgetting.”

Through their experiments, the researchers observed that this method allows a single model to progressively build new skills without suffering performance degradation on existing ones. This breakthrough could greatly simplify how businesses update and specialize their production models over time.

The need and the solution

Despite rapid advancements in foundational models, most enterprise AI systems remain static after deployment. While prompting and retrieval can modify behavior during inference, the model’s core parameters don’t adapt to internalize new knowledge or skills.

Consequently, each subsequent fine-tuning cycle risks catastrophic forgetting, where improvements on a fresh task lead to a decline in performance on earlier, established ones.

“To power the next generation of foundation models, we must conquer the challenge of continual learning: enabling AI systems to continuously learn and evolve, much like humans acquire knowledge and refine skills throughout their lives,” the researchers emphasized.

Reinforcement learning offers a pathway to train models on data generated by their own policy, which helps mitigate forgetting. However, this typically demands explicit reward functions, which are not always straightforward to define.

SDFT proposes an alternative. Instead of deriving a reward function, it harnesses the model’s innate in-context learning capabilities to generate on-policy learning signals directly from demonstrations.

During training, the same model assumes a dual role. A “teacher” version is presented with both the query and expert demonstrations. A “student” version, mirroring real-world application, sees only the query. The student generates its own responses, and its parameters are then adjusted so that its predictions on those responses align with the teacher’s.
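To make the teacher-student mechanics concrete, the sketch below shows what a single on-policy self-distillation update could look like in PyTorch with Hugging Face transformers. It is an illustration under stated assumptions rather than the researchers’ reference implementation: the model name, prompt format, KL-divergence objective, and the sdft_step helper are hypothetical stand-ins.

```python
# Minimal sketch of one on-policy self-distillation update (hypothetical names;
# not the authors' reference implementation). Assumes a Hugging Face causal LM.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # small stand-in; SDFT targets larger instruction-tuned models
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def sdft_step(query: str, demonstrations: str) -> float:
    """One update: a demonstration-conditioned 'teacher' scores a response
    sampled by the query-only 'student', and the student matches it."""
    # 1. Student (query only) samples an on-policy response.
    student_inputs = tokenizer(query, return_tensors="pt")
    with torch.no_grad():
        sampled = model.generate(**student_inputs, max_new_tokens=64, do_sample=True)
    response = sampled[:, student_inputs["input_ids"].shape[1]:]
    resp_len = response.shape[1]

    # 2. Teacher: same weights, but its prompt also contains expert demonstrations.
    teacher_prompt = tokenizer(demonstrations + "\n" + query, return_tensors="pt")
    teacher_ids = torch.cat([teacher_prompt["input_ids"], response], dim=-1)
    with torch.no_grad():
        teacher_logits = model(teacher_ids).logits[:, -(resp_len + 1):-1, :]

    # 3. Student re-scores the same response without the demonstrations.
    student_ids = torch.cat([student_inputs["input_ids"], response], dim=-1)
    student_logits = model(student_ids).logits[:, -(resp_len + 1):-1, :]

    # 4. Pull the student's token distributions toward the teacher's.
    loss = F.kl_div(
        F.log_softmax(student_logits, dim=-1),
        F.log_softmax(teacher_logits, dim=-1),
        log_target=True,
        reduction="batchmean",
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The extra sampling pass and the two scoring passes per update help explain why this style of training is more expensive than plain supervised fine-tuning, a cost discussed later in the article.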

“In sequential learning experiments, SDFT allows a single model to incrementally acquire multiple skills over time without performance regression, establishing on-policy distillation as a viable approach for continual learning from demonstrations,” the researchers concluded.

Challenges to overcome

SDFT appears highly promising, especially as it eliminates the need to maintain extensive “model zoos” of individual adapters or fine-tuned variations, according to Lian Jye Su, chief analyst at Omdia.

However, its widespread commercial adoption hinges on overcoming several persistent challenges.

For example, SDFT demands significantly more training time and approximately 2.5 times the computational resources compared to standard SFT. It also relies on sufficiently capable base models possessing robust in-context learning abilities.

Sanchit Vir Gogia, chief analyst at Greyhound Research, also cautioned that SDFT doesn’t negate the necessity for robust regression testing infrastructure. Since the model learns from its own generated outputs, businesses must guarantee reproducibility through stringent version control and artifact logging.

“Consolidating models shifts operational complexity from managing sheer numbers to ensuring deep governance,” Gogia stated.

These costs can be justified, Su argues, by preventing the catastrophic loss of crucial context and bypassing the complexity of defining reward functions in reinforcement learning.

Nevertheless, it may take some time before the method reaches mainstream enterprise use. “SDFT will likely first find experimentation in internal developer tools and general assistants, where the risks associated with a ‘self-taught error’ are lower than in highly regulated sectors such as financial or medical decision-making,” observed Faisal Kawoosa, founder and lead analyst at Techarc.
