Act Wisely: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models Paper • 2604.08545 • Published 3 days ago • 33
Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale Paper • 2603.25040 • Published 17 days ago • 126
Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale Paper • 2603.25040 • Published 17 days ago • 126
Trust Your Critic: Robust Reward Modeling and Reinforcement Learning for Faithful Image Editing and Generation Paper • 2603.12247 • Published about 1 month ago • 23
Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training Paper • 2603.12255 • Published about 1 month ago • 91
EndoCoT: Scaling Endogenous Chain-of-Thought Reasoning in Diffusion Models Paper • 2603.12252 • Published about 1 month ago • 12
Trust Your Critic: Robust Reward Modeling and Reinforcement Learning for Faithful Image Editing and Generation Paper • 2603.12247 • Published about 1 month ago • 23
DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing Paper • 2602.12205 • Published Feb 12 • 80
ARM-Thinker Collection [CVPR2026] Official Implementation of "ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning" • 1 item • Updated Feb 26 • 1
UniReason 1.0: A Unified Reasoning Framework for World Knowledge Aligned Image Generation and Editing Paper • 2602.02437 • Published Feb 2 • 80
SmartSearch: Process Reward-Guided Query Refinement for Search Agents Paper • 2601.04888 • Published Jan 8 • 10