How Fast Should a Model Commit to Supervision? Training Reasoning Models on the Tsallis Loss Continuum Paper • 2604.25907 • Published 11 days ago • 3
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published Apr 8 • 324
Graph-Based Chain-of-Thought Pruning for Reducing Redundant Reflections in Reasoning LLMs Paper • 2604.05643 • Published Apr 7 • 13
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning Paper • 2604.02721 • Published Apr 3 • 627
qingwuuu/ngld-grape-leaf-vlm-w-img-without-diff-ref-v4 • Updated about 1 month ago • 7.3k • 123 • 2
UniMixer: A Unified Architecture for Scaling Laws in Recommendation Systems Paper • 2604.00590 • Published Apr 1 • 8
TAPS: Task Aware Proposal Distributions for Speculative Sampling Paper • 2603.27027 • Published Mar 27 • 143