LARYBench Collection Models trained in LARYBench. https://huggingface.co/papers/2604.11689 • 2 items • Updated 12 days ago
General365: Benchmarking General Reasoning in Large Language Models Across Diverse and Challenging Tasks Paper • 2604.11778 • Published 21 days ago • 9
EvalTalker: Learning to Evaluate Real-Portrait-Driven Multi-Subject Talking Humans Paper • 2512.01340 • Published Dec 1, 2025
LARY: A Latent Action Representation Yielding Benchmark for Generalizable Vision-to-Action Alignment Paper • 2604.11689 • Published 21 days ago • 21
LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning Paper • 2603.21065 • Published Mar 22 • 77
UNO-Bench: A Unified Benchmark for Exploring the Compositional Law Between Uni-modal and Omni-modal in OmniModels Paper • 2510.18915 • Published Oct 21, 2025 • 7