Chenming Zhu

ChaimZhu

https://zcmax.github.io/

AI & ML interests

Multimodal Large Language Models, 3D Perception and Understanding, Embodied AI

Recent Activity

upvoted a paper 9 days ago

G^2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning

updated a model 12 days ago

InternRobotics/InternVLA-N1

upvoted a paper 2 months ago

SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer

authored a paper 5 months ago

OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding

Paper • 2507.07984 • Published Jul 10 • 42

authored a paper about 1 year ago

LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness

Paper • 2409.18125 • Published Sep 26, 2024 • 34