Rethinking Adapter Composition in Language Models

Large language models (LLMs) have transformed how we approach tasks across diverse domains. Yet access control remains a challenge, especially when the goal is modularity: adding domain-specific capabilities without retraining the model and without cross-domain interference. A recent study on adapter composition brings this problem to the forefront.

The Hypothesis Tested

The research focuses on the DoRA-RBAC framework, which uses hierarchical adapter composition built on weight-decomposed low-rank adaptation (DoRA). The key hypothesis? Interference arises from overlapping linear parameter updates. If that is true, enforcing orthogonality or directional independence between updates should, in theory, boost multi-domain performance. The results tell a different story.
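To make "weight-decomposed" concrete, here is a minimal NumPy sketch of the commonly described DoRA recomposition, in which a weight matrix is split into a per-column magnitude and a unit-norm direction, and the low-rank update is applied to the direction. This article does not give the framework's formulas, so the function name `dora_weight`, the shapes, and the initialization below are illustrative assumptions rather than the study's implementation.

```python
import numpy as np

def dora_weight(w0, b, a, m):
    """Recompose a DoRA-style weight: magnitude times unit-norm direction.

    Assumed shapes (illustrative): w0 is (d, k), b is (d, r), a is (r, k),
    m is (1, k) with one learnable magnitude per column.
    """
    # Direction before normalization: frozen base weight plus low-rank update.
    v = w0 + b @ a
    # One L2 norm per column, kept as shape (1, k) so it broadcasts over rows.
    col_norm = np.linalg.norm(v, axis=0, keepdims=True)
    return m * (v / col_norm)

rng = np.random.default_rng(0)
d, k, r = 6, 4, 2
w0 = rng.normal(size=(d, k))
b = np.zeros((d, r))                            # zero init: the update starts at zero
a = rng.normal(size=(r, k))
m = np.linalg.norm(w0, axis=0, keepdims=True)   # magnitude initialized from w0

w_init = dora_weight(w0, b, a, m)                            # equals w0 at initialization
w_tuned = dora_weight(w0, rng.normal(size=(d, r)), a, m)     # direction changed, magnitude fixed
```

With a zero update the recomposed weight equals the base weight, and for any update the column norms of the result are exactly `m`. That separation of magnitude from direction is what makes "directional independence" a natural quantity to test in this setting.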

Evaluating Merging Strategies

Researchers compared conventional Euclidean merging with a geometry-aware, Riemannian-inspired approach. The latter approximates the Fréchet mean via normalized directional averaging. Benchmarks included GPQA, PubMedQA, SimpleQA, and WMDP, evaluated on the LLaMA-3.1-8B and Mistral-7B models. Surprisingly, while single-domain performance was in line with LoRA's, the geometry-aware strategy did not consistently outperform standard averaging in multi-domain scenarios.
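The difference between the two merging strategies can be sketched in a few lines of NumPy. The study's exact procedure is not described here, so this is an assumption-laden illustration: `euclidean_merge` is a plain arithmetic mean, and `directional_merge` normalizes each update, averages the unit directions, renormalizes, and then rescales by the mean norm. The rescaling step and both function names are hypothetical choices made for this example.

```python
import numpy as np

def euclidean_merge(updates):
    """Conventional merge: element-wise arithmetic mean of the updates."""
    return np.mean(np.stack(updates), axis=0)

def directional_merge(updates, eps=1e-12):
    """Normalized directional averaging (illustrative sketch).

    Averages unit directions and projects the result back to unit norm,
    then rescales by the mean norm of the inputs (an assumption here).
    """
    stacked = np.stack(updates)
    flat = stacked.reshape(len(updates), -1)
    norms = np.linalg.norm(flat, axis=1)
    units = flat / (norms[:, None] + eps)          # each update as a unit vector
    mean_dir = units.mean(axis=0)
    mean_dir = mean_dir / (np.linalg.norm(mean_dir) + eps)
    return (norms.mean() * mean_dir).reshape(stacked.shape[1:])

# Two orthogonal updates of equal size.
u1 = np.array([2.0, 0.0])
u2 = np.array([0.0, 2.0])

euclid = euclidean_merge([u1, u2])         # norm shrinks below the inputs' norm
directional = directional_merge([u1, u2])  # norm stays at the inputs' mean norm
```

For these two orthogonal inputs both merges point in the same direction, but the Euclidean mean has a smaller norm than either input while the directional merge keeps the average norm. Whether that kind of geometric difference translates into better multi-domain accuracy is exactly what the study tested.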

Rethinking Parameter-Space Geometry

The ablation study shows that angular alignment and orthogonality of adapter updates are weak predictors of composition performance. This finding suggests that adapter interference is not primarily a matter of parameter-space geometry. Instead, interactions in shared nonlinear representations matter more. This challenges prevailing assumptions. Should we rethink how we approach modularity in LLMs?
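For readers who want to see what "angular alignment" between adapter updates means in practice, one common way to measure it is the cosine similarity between flattened update matrices, where a value near zero indicates near-orthogonal updates. The sketch below is an assumption about how such a diagnostic could be computed; the helper names and the toy matrices are hypothetical and are not taken from the study.

```python
import numpy as np

def update_cosine(delta_a, delta_b, eps=1e-12):
    """Cosine similarity between two adapter updates, treated as flat vectors."""
    x, y = delta_a.ravel(), delta_b.ravel()
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y) + eps))

def pairwise_cosines(deltas):
    """Symmetric matrix of cosine similarities for a list of updates."""
    n = len(deltas)
    sims = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            sims[i, j] = sims[j, i] = update_cosine(deltas[i], deltas[j])
    return sims

# Toy updates standing in for per-domain adapter deltas.
deltas = [
    np.array([[1.0, 0.0], [0.0, 0.0]]),
    np.array([[0.0, 1.0], [0.0, 0.0]]),   # orthogonal to the first
    np.array([[2.0, 0.0], [0.0, 0.0]]),   # same direction as the first
]
sims = pairwise_cosines(deltas)
```

A diagnostic like this only describes the linear parameter updates. It says nothing about how those updates interact once they pass through the model's nonlinear layers, which is consistent with the study's finding that such measures are weak predictors of composition performance.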

The Bigger Picture

This study raises essential questions about current LLM methodology. If parameter-space geometry isn't the culprit for interference, what is? The answer might lie in more nuanced aspects of model architecture or data distribution. The paper's key contribution is a fresh perspective that could lead to more effective strategies for multi-domain language models.

The findings call for a reevaluation of how modular mechanisms are employed. As models grow in complexity and application domains expand, understanding these interactions becomes increasingly important. It's not just about the technicalities. It's about paving the way for more versatile and scalable LLMs.