The Smol Training Playbook: The secrets to building world-class LLMs Space • Featured • 2.57k
A Rigorous Benchmark with Multidimensional Evaluation for Deep Research Agents: From Answers to Reports Paper • 2510.02190 • Published Oct 2 • 18
LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models Paper • 2407.12772 • Published Jul 17, 2024 • 35