Jiawei Wang

Jarvis1111

https://jarvisustc.github.io/

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 4 days ago

DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle

Paper • 2512.04324 • Published 5 days ago • 139

upvoted a paper 7 days ago

Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published 8 days ago • 82

upvoted a paper 20 days ago

Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models

Paper • 2511.08577 • Published 27 days ago • 104

upvoted a paper 2 months ago

MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use

Paper • 2509.24002 • Published Sep 28 • 173

upvoted 5 papers 3 months ago

Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents

Paper • 2509.09265 • Published Sep 11 • 46

Reverse-Engineered Reasoning for Open-Ended Generation

Paper • 2509.06160 • Published Sep 7 • 149

Why Language Models Hallucinate

Paper • 2509.04664 • Published Sep 4 • 193

SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published Sep 2 • 83

TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling

Paper • 2508.17445 • Published Aug 24 • 80

upvoted 4 papers 4 months ago

WideSearch: Benchmarking Agentic Broad Info-Seeking

Paper • 2508.07999 • Published Aug 11 • 110

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published Aug 7 • 180

Qwen-Image Technical Report

Paper • 2508.02324 • Published Aug 4 • 263

A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence

Paper • 2507.21046 • Published Jul 28 • 82

upvoted a paper 5 months ago

4KAgent: Agentic Any Image to 4K Super-Resolution

Paper • 2507.07105 • Published Jul 9 • 105

upvoted a paper 7 months ago

DoctorAgent-RL: A Multi-Agent Collaborative Reinforcement Learning System for Multi-Turn Clinical Dialogue

Paper • 2505.19630 • Published May 26 • 7

upvoted an article 7 months ago

Our Transformers Code Agent beats the GAIA benchmark 🏅

Article • Published Jul 1, 2024

upvoted 3 papers 8 months ago

Inference-Time Scaling for Generalist Reward Modeling

Paper • 2504.02495 • Published Apr 3 • 57

Learning to Reason under Off-Policy Guidance

Paper • 2504.14945 • Published Apr 21 • 88

Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks

Paper • 2504.01308 • Published Apr 2 • 14

upvoted a paper 9 months ago

UniHDSA: A Unified Relation Prediction Approach for Hierarchical Document Structure Analysis

Paper • 2503.15893 • Published Mar 20 • 2
