Zhixiong Zhang (SII)

rookiexiong · rookiexiong7

AI & ML interests

Ph.D. student at SJTU and SII. SII is an institution dedicated to innovation in education and research in the field of AI.

Recent Activity

authored a paper 3 days ago

LoMo: Local Modality Substitution for Deeper Vision-Language Fusion

upvoted a paper 3 days ago

LoMo: Local Modality Substitution for Deeper Vision-Language Fusion

submitted a paper 3 days ago

LoMo: Local Modality Substitution for Deeper Vision-Language Fusion

authored a paper 3 days ago

LoMo: Local Modality Substitution for Deeper Vision-Language Fusion

Paper • 2605.30265 • Published 5 days ago • 20

submitted a paper to Daily Papers 3 days ago

LoMo: Local Modality Substitution for Deeper Vision-Language Fusion

Paper • 2605.30265 • Published 5 days ago • 20

authored 5 papers 4 days ago

LSVOS 2025 Challenge Report: Recent Advances in Complex Video Object Segmentation

Paper • 2510.11063 • Published Oct 13, 2025 • 1

UniReason 1.0: A Unified Reasoning Framework for World Knowledge Aligned Image Generation and Editing

Paper • 2602.02437 • Published Feb 2 • 80

DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing

Paper • 2602.12205 • Published Feb 12 • 83

WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation

Paper • 2605.10912 • Published 22 days ago • 46

SetCon: Towards Open-Ended Referring Segmentation via Set-Level Concept Prediction

Paper • 2605.20110 • Published 14 days ago • 3

authored a paper 9 months ago

CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning

Paper • 2508.20096 • Published Aug 27, 2025 • 37

authored 3 papers 10 months ago

SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation

Paper • 2502.13128 • Published Feb 18, 2025 • 41

GPT4Scene: Understand 3D Scenes from Videos with Vision-Language Models

Paper • 2501.01428 • Published Jan 2, 2025

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction

Paper • 2507.15852 • Published Jul 21, 2025 • 38