Lingdong Kong PRO

ldkong

https://ldkong.com

AI & ML interests

3D Perception, Generation, and World Modeling

Recent Activity

authored a paper 9 days ago

Watch, Remember, Reason: Human-View Video Understanding with MLLMs

authored a paper 9 days ago

DanceOPD: On-Policy Generative Field Distillation

commented on a paper 10 days ago

DanceOPD: On-Policy Generative Field Distillation

upvoted a paper 10 days ago

DanceOPD: On-Policy Generative Field Distillation

Paper • 2606.27377 • Published 11 days ago • 81

upvoted a paper 27 days ago

Watch, Remember, Reason: Human-View Video Understanding with MLLMs

Paper • 2606.07433 • Published Jun 5 • 21

upvoted a paper about 2 months ago

AI for Auto-Research: Roadmap & User Guide

Paper • 2605.18661 • Published May 18 • 69

upvoted a paper 2 months ago

Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond

Paper • 2604.22748 • Published Apr 24 • 231

upvoted a paper 3 months ago

OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation

Paper • 2604.18486 • Published Apr 20 • 96

upvoted 2 papers 7 months ago

Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future

Paper • 2512.16760 • Published Dec 18, 2025 • 15

U4D: Uncertainty-Aware 4D World Modeling from LiDAR Sequences

Paper • 2512.02982 • Published Dec 2, 2025 • 3

upvoted 4 papers 8 months ago

Scaling Spatial Intelligence with Multimodal Foundation Models

Paper • 2511.13719 • Published Nov 17, 2025 • 50

MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling

Paper • 2511.11793 • Published Nov 14, 2025 • 197

3EED: Ground Everything Everywhere in 3D

Paper • 2511.01755 • Published Nov 3, 2025 • 11

Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations

Paper • 2510.23607 • Published Oct 27, 2025 • 181

upvoted 3 papers 9 months ago

VideoLucy: Deep Memory Backtracking for Long Video Understanding

Paper • 2510.12422 • Published Oct 14, 2025 • 1

Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence

Paper • 2510.20579 • Published Oct 23, 2025 • 56

RewardMap: Tackling Sparse Rewards in Fine-grained Visual Reasoning via Multi-Stage Reinforcement Learning

Paper • 2510.02240 • Published Oct 2, 2025 • 18

upvoted 6 papers 10 months ago

UniSeg: A Unified Multi-Modal LiDAR Segmentation Network and the OpenPCSeg Codebase

Paper • 2309.05573 • Published Sep 11, 2023 • 2

The RoboDepth Challenge: Methods and Advancements Towards Robust Depth Estimation

Paper • 2307.15061 • Published Jul 27, 2023 • 1

4D Contrastive Superflows are Dense 3D Representation Learners

Paper • 2407.06190 • Published Jul 8, 2024 • 1

Unsupervised Video Domain Adaptation for Action Recognition: A Disentanglement Perspective

Paper • 2208.07365 • Published Aug 15, 2022 • 1

LargeAD: Large-Scale Cross-Sensor Data Pretraining for Autonomous Driving

Paper • 2501.04005 • Published Jan 7, 2025 • 1

FRNet: Frustum-Range Networks for Scalable LiDAR Segmentation

Paper • 2312.04484 • Published Dec 7, 2023 • 1