If You Can't Use Them, Recycle Them: Optimizing Merging at Scale Mitigates Performance Tradeoffs Paper • 2412.04144 • Published Dec 5, 2024 • 6
Merging in a Bottle: Differentiable Adaptive Merging (DAM) and the Path from Averaging to Automation Paper • 2410.08371 • Published Oct 10, 2024 • 3
MERGE^3: Efficient Evolutionary Merging on Consumer-grade GPUs Paper • 2502.10436 • Published Feb 9, 2025 • 1
Mix Data or Merge Models? Optimizing for Diverse Multi-Task Learning Paper • 2410.10801 • Published Oct 14, 2024 • 3
Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance Paper • 2511.13254 • Published Nov 17, 2025 • 136
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time Paper • 2203.05482 • Published Mar 10, 2022 • 7
Parameter Efficient Merging for Multimodal Large Language Models with Complementary Parameter Adaptation Paper • 2502.17159 • Published Feb 24, 2025 • 2
Unconstrained Model Merging for Enhanced LLM Reasoning Paper • 2410.13699 • Published Oct 17, 2024 • 1
Extend Model Merging from Fine-Tuned to Pre-Trained Large Language Models via Weight Disentanglement Paper • 2408.03092 • Published Aug 6, 2024 • 1
Merging Smarter, Generalizing Better: Enhancing Model Merging on OOD Data Paper • 2506.09093 • Published Jun 10, 2025
Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent Paper • 2501.01230 • Published Jan 2, 2025
Realistic Evaluation of Model Merging for Compositional Generalization Paper • 2409.18314 • Published Sep 26, 2024
Expert Merging: Model Merging with Unsupervised Expert Alignment and Importance-Guided Layer Chunking Paper • 2509.25712 • Published Sep 30, 2025 • 1
ATM: Improving Model Merging by Alternating Tuning and Merging Paper • 2411.03055 • Published Nov 5, 2024 • 1
Unlocking Efficient Long-to-Short LLM Reasoning with Model Merging Paper • 2503.20641 • Published Mar 26, 2025 • 10
REAP the Experts: Why Pruning Prevails for One-Shot MoE compression Paper • 2510.13999 • Published Oct 15, 2025 • 6