AnIdealRing

SmartDazi

AI & ML interests

None yet

Recent Activity

upvoted a paper 24 days ago

MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild

Paper • 2603.17187 • Published 26 days ago • 137

upvoted a paper 26 days ago

AgentProcessBench: Diagnosing Step-Level Process Quality in Tool-Using Agents

Paper • 2603.14465 • Published 28 days ago • 23

liked a dataset 28 days ago

LulaCola/AgentProcessBench

Viewer • Updated 26 days ago • 1k • 384 • 14

upvoted a paper about 1 month ago

How Far Can Unsupervised RLVR Scale LLM Training?

Paper • 2603.08660 • Published Mar 9 • 57

upvoted a paper about 2 months ago

Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation

Paper • 2602.12125 • Published Feb 12 • 62

upvoted 2 papers 2 months ago

Outcome Accuracy is Not Enough: Aligning the Reasoning Process of Reward Models

Paper • 2602.04649 • Published Feb 4 • 12

RE-TRAC: REcursive TRAjectory Compression for Deep Search Agents

Paper • 2602.02486 • Published Feb 2 • 20

upvoted 2 papers 3 months ago

InT: Self-Proposed Interventions Enable Credit Assignment in LLM Reasoning

Paper • 2601.14209 • Published Jan 20 • 6

DARC: Decoupled Asymmetric Reasoning Curriculum for LLM Evolution

Paper • 2601.13761 • Published Jan 20 • 16

updated 2 models 3 months ago

openbmb/AgentCPM-Explore-GGUF

4B • Updated Jan 17 • 206 • 22

openbmb/AgentCPM-Explore

Text Generation • 4B • Updated Jan 18 • 144 • 328

upvoted 2 papers 4 months ago

Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving

Paper • 2512.10739 • Published Dec 11, 2025 • 47

Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning

Paper • 2512.07461 • Published Dec 8, 2025 • 79

upvoted a paper 6 months ago

LaSeR: Reinforcement Learning with Last-Token Self-Rewarding

Paper • 2510.14943 • Published Oct 16, 2025 • 40

upvoted a paper 8 months ago

R-Zero: Self-Evolving Reasoning LLM from Zero Data

Paper • 2508.05004 • Published Aug 7, 2025 • 131

upvoted 2 papers 9 months ago

WorkflowLLM: Enhancing Workflow Orchestration Capability of Large Language Models

Paper • 2411.05451 • Published Nov 8, 2024 • 2

WebSailor: Navigating Super-human Reasoning for Web Agent

Paper • 2507.02592 • Published Jul 3, 2025 • 126

upvoted 2 papers 10 months ago

Self-Challenging Language Model Agents

Paper • 2506.01716 • Published Jun 2, 2025 • 10

MiniCPM4: Ultra-Efficient LLMs on End Devices

Paper • 2506.07900 • Published Jun 9, 2025 • 96

updated a model 10 months ago

openbmb/MiniCPM4-MCP

Text Generation • 8B • Updated Jun 10, 2025 • 41 • 35
