Anchor Forcing: Anchor Memory and Tri-Region RoPE for Interactive Streaming Video Diffusion Paper • 2603.13405 • Published 28 days ago
SpatialEdit: Benchmarking Fine-Grained Image Spatial Editing Paper • 2604.04911 • Published 3 days ago • 31
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression Paper • 2604.04921 • Published 3 days ago • 82
Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing Paper • 2603.12254 • Published 27 days ago • 21
SANA-Video Collection • SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer • 10 items • Updated 24 days ago • 7
3D Aware Region Prompted Vision Language Model Paper • 2509.13317 • Published Sep 16, 2025 • 14
ArcMemo: Abstract Reasoning Composition with Lifelong LLM Memory Paper • 2509.04439 • Published Sep 4, 2025 • 1
KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems Paper • 2510.12872 • Published Oct 14, 2025 • 4
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM Paper • 2510.15870 • Published Oct 17, 2025 • 92
Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail Paper • 2511.00088 • Published Oct 30, 2025 • 4
SparseVILA: Decoupling Visual Sparsity for Efficient VLM Inference Paper • 2510.17777 • Published Oct 20, 2025 • 1