Mix-MoE: Improving Multilingual Machine Translation of Large Language Models through Mixed MoEs Paper • 2605.24681 • Published 10 days ago • 5
Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players Paper • 2605.28816 • Published 6 days ago • 413
DVAO: Dynamic Variance-adaptive Advantage Optimization for Multi-reward Reinforcement Learning Paper • 2605.25604 • Published 8 days ago • 133
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published 21 days ago • 195
DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards Paper • 2605.21467 • Published 13 days ago • 204
OScaR: The Occam's Razor for Extreme KV Cache Quantization in LLMs and Beyond Paper • 2605.19660 • Published 14 days ago • 40
Who Prices Cognitive Labor in the Age of Agents? Compute-Anchored Wages Paper • 2605.05558 • Published 25 days ago • 3
sentence-transformers/all-mpnet-base-v2 Sentence Similarity • 0.1B • Updated Aug 19, 2025 • 35.8M • 1.3k
Leveraging Verifier-Based Reinforcement Learning in Image Editing Paper • 2604.27505 • Published Apr 30 • 57
Experience Transfer for Multimodal LLM Agents in Minecraft Game Paper • 2604.05533 • Published Apr 7 • 16
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 504
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning Paper • 2604.02721 • Published Apr 3 • 630