Rui Sun

ThreeSR

https://threesr.github.io/

AI & ML interests

Vision and Language Multimodal Learning, CV, NLP, LLM

Recent Activity

updated a collection 8 days ago

upvoted a paper 3 days ago

Qwen-Image-Agent: Bridging the Context Gap in Real-World Image Generation

Paper • 2606.26907 • Published 4 days ago • 41

upvoted a paper 4 days ago

Qwen-AgentWorld: Language World Models for General Agents

Paper • 2606.24597 • Published 6 days ago • 137

upvoted 3 papers 8 days ago

DeNovoSWE: Scaling Long-Horizon Environments for Generating Entire Repositories from Scratch

Paper • 2606.10728 • Published 20 days ago • 34

Beyond Scalar Rewards by Internalizing Reasoning into Score Distributions

Paper • 2606.09076 • Published 21 days ago • 63

Agents' Last Exam

Paper • 2606.05405 • Published 26 days ago • 366

upvoted 2 papers 23 days ago

OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents

Paper • 2606.02031 • Published 28 days ago • 20

PaddleOCR-VL-1.6: Expanding the Frontier of Document Parsing with Under-Optimized Region Refinement and Progressive Post-Training

Paper • 2606.03264 • Published 27 days ago • 23

upvoted 4 papers about 1 month ago

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture

Paper • 2605.12500 • Published May 12 • 194

HyperEyes: Dual-Grained Efficiency-Aware Reinforcement Learning for Parallel Multimodal Search Agents

Paper • 2605.07177 • Published May 8 • 63

Teaching Language Models to Think in Code

Paper • 2605.07237 • Published May 11 • 31

From Web to Pixels: Bringing Agentic Search into Visual Perception

Paper • 2605.12497 • Published May 12 • 14

upvoted 3 papers about 2 months ago

Image Generators are Generalist Vision Learners

Paper • 2604.20329 • Published Apr 22 • 22

Co-Director: Agentic Generative Video Storytelling

Paper • 2604.24842 • Published Apr 27 • 16

AutoResearchBench: Benchmarking AI Agents on Complex Scientific Literature Discovery

Paper • 2604.25256 • Published Apr 28 • 30

upvoted 2 papers 2 months ago

Mind DeepResearch Technical Report

Paper • 2604.14518 • Published Apr 17 • 23

DR^{3}-Eval: Towards Realistic and Reproducible Deep Research Evaluation

Paper • 2604.14683 • Published Apr 16 • 36

upvoted 4 papers 3 months ago

WildDet3D: Scaling Promptable 3D Detection in the Wild

Paper • 2604.08626 • Published Apr 9 • 248

OpenSpatial: A Principled Data Engine for Empowering Spatial Intelligence

Paper • 2604.07296 • Published Apr 8 • 40

MolmoWeb: Open Visual Web Agent and Open Data for the Open Web

Paper • 2604.08516 • Published Apr 9 • 47

Act Wisely: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models

Paper • 2604.08545 • Published Apr 9 • 41