merging - a leonardlin Collection

merging

updated 13 days ago

If You Can't Use Them, Recycle Them: Optimizing Merging at Scale Mitigates Performance Tradeoffs

Paper • 2412.04144 • Published Dec 5, 2024 • 6
Merging in a Bottle: Differentiable Adaptive Merging (DAM) and the Path from Averaging to Automation

Paper • 2410.08371 • Published Oct 10, 2024 • 3
MERGE^3: Efficient Evolutionary Merging on Consumer-grade GPUs

Paper • 2502.10436 • Published Feb 9, 2025 • 1
Mergenetic: a Simple Evolutionary Model Merging Library

Paper • 2505.11427 • Published May 16, 2025 • 14
Evolutionary Optimization of Model Merging Recipes

Paper • 2403.13187 • Published Mar 19, 2024 • 58
Mix Data or Merge Models? Optimizing for Diverse Multi-Task Learning

Paper • 2410.10801 • Published Oct 14, 2024 • 3
SEA-LION: Southeast Asian Languages in One Network

Paper • 2504.05747 • Published Apr 8, 2025
What Matters for Model Merging at Scale?

Paper • 2410.03617 • Published Oct 4, 2024 • 9
Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance

Paper • 2511.13254 • Published 20 days ago • 134
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time

Paper • 2203.05482 • Published Mar 10, 2022 • 7
Parameter Efficient Merging for Multimodal Large Language Models with Complementary Parameter Adaptation

Paper • 2502.17159 • Published Feb 24, 2025 • 2
Unconstrained Model Merging for Enhanced LLM Reasoning

Paper • 2410.13699 • Published Oct 17, 2024 • 1
Extend Model Merging from Fine-Tuned to Pre-Trained Large Language Models via Weight Disentanglement

Paper • 2408.03092 • Published Aug 6, 2024 • 1
Merging Smarter, Generalizing Better: Enhancing Model Merging on OOD Data

Paper • 2506.09093 • Published Jun 10, 2025
Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent

Paper • 2501.01230 • Published Jan 2, 2025
Realistic Evaluation of Model Merging for Compositional Generalization

Paper • 2409.18314 • Published Sep 26, 2024
Resolving Interference When Merging Models

Paper • 2306.01708 • Published Jun 2, 2023 • 15
Model Merging with Functional Dual Anchors

Paper • 2510.21223 • Published Oct 24, 2025 • 12
Activation-Informed Merging of Large Language Models

Paper • 2502.02421 • Published Feb 4, 2025 • 6
Expert Merging: Model Merging with Unsupervised Expert Alignment and Importance-Guided Layer Chunking

Paper • 2509.25712 • Published Sep 30, 2025 • 1
ATM: Improving Model Merging by Alternating Tuning and Merging

Paper • 2411.03055 • Published Nov 5, 2024 • 1
MergeBench: A Benchmark for Merging Domain-Specialized LLMs

Paper • 2505.10833 • Published May 16, 2025
Unlocking Efficient Long-to-Short LLM Reasoning with Model Merging

Paper • 2503.20641 • Published Mar 26, 2025 • 10
REAP the Experts: Why Pruning Prevails for One-Shot MoE compression

Paper • 2510.13999 • Published Oct 15, 2025 • 5