Collections
Discover the best community collections!
Collections including paper arxiv:2408.03361

- Multimodal Clembench
  🏆 3 • Explore and compare multimodal models with interactive leaderboards and plots

- SEED-Bench Leaderboard
  🏆 85 • Submit model evaluation results to the leaderboard

- MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
  Paper • 2311.16502 • Published • 37
- MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark
  Paper • 2409.02813 • Published • 33

- Img-Diff: Contrastive Data Synthesis for Multimodal Large Language Models
  Paper • 2408.04594 • Published • 14
- Achieving Human Level Competitive Robot Table Tennis
  Paper • 2408.03906 • Published • 28
- GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI
  Paper • 2408.03361 • Published • 85
- Heavy Labels Out! Dataset Distillation with Label Space Lightening
  Paper • 2408.08201 • Published • 21

- SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension
  Paper • 2404.16790 • Published • 10
- MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos
  Paper • 2406.08407 • Published • 28
- GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI
  Paper • 2408.03361 • Published • 85

- Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
  Paper • 2405.08748 • Published • 23
- Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection
  Paper • 2405.10300 • Published • 30
- Chameleon: Mixed-Modal Early-Fusion Foundation Models
  Paper • 2405.09818 • Published • 132
- OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
  Paper • 2405.11143 • Published • 41

- GAIA: a benchmark for General AI Assistants
  Paper • 2311.12983 • Published • 244
- MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
  Paper • 2311.16502 • Published • 37
- BLINK: Multimodal Large Language Models Can See but Not Perceive
  Paper • 2404.12390 • Published • 26
- RULER: What's the Real Context Size of Your Long-Context Language Models?
  Paper • 2404.06654 • Published • 39