Haolin Liu

lhl616

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 21 days ago

RelayLLM: Efficient Reasoning via Collaborative Decoding

Paper • 2601.05167 • Published 21 days ago • 29

upvoted a paper 22 days ago

Benchmark^2: Systematic Evaluation of LLM Benchmarks

Paper • 2601.03986 • Published 22 days ago • 34

upvoted a paper about 1 month ago

Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning

Paper • 2512.15687 • Published Dec 17, 2025 • 20

upvoted a paper about 2 months ago

MotionEdit: Benchmarking and Learning Motion-Centric Image Editing

Paper • 2512.10284 • Published Dec 11, 2025 • 26

upvoted a paper 3 months ago

Every Question Has Its Own Value: Reinforcement Learning with Explicit Human Values

Paper • 2510.20187 • Published Oct 23, 2025 • 19

upvoted 4 papers 4 months ago

VOGUE: Guiding Exploration with Visual Uncertainty Improves Multimodal Reasoning

Paper • 2510.01444 • Published Oct 1, 2025 • 20

CLUE: Non-parametric Verification from Experience via Hidden-State Clustering

Paper • 2510.01591 • Published Oct 2, 2025 • 28

TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning

Paper • 2509.25760 • Published Sep 30, 2025 • 55

Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation

Paper • 2509.15194 • Published Sep 18, 2025 • 33

upvoted 4 papers 5 months ago

CDE: Curiosity-Driven Exploration for Efficient Reinforcement Learning in Large Language Models

Paper • 2509.09675 • Published Sep 11, 2025 • 28

Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9, 2025 • 102

Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training

Paper • 2509.03403 • Published Sep 3, 2025 • 23

Self-Rewarding Vision-Language Model via Reasoning Decomposition

Paper • 2508.19652 • Published Aug 27, 2025 • 84

upvoted 2 papers 6 months ago

Towards Optimal Regret in Adversarial Linear MDPs with Bandit Feedback

Paper • 2310.11550 • Published Oct 17, 2023 • 1

R-Zero: Self-Evolving Reasoning LLM from Zero Data

Paper • 2508.05004 • Published Aug 7, 2025 • 130

upvoted a paper 7 months ago

One Token to Fool LLM-as-a-Judge

Paper • 2507.08794 • Published Jul 11, 2025 • 32