Fangyuan Yu PRO

Ksgk-fy

fangyuan-ksgk

AI & ML interests

AGI

Recent Activity

updated a collection about 1 month ago

Representation & Optimization

upvoted a paper about 1 month ago

Scaling Latent Reasoning via Looped Language Models

upvoted a paper about 1 month ago

Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning

upvoted 2 papers about 1 month ago

Scaling Latent Reasoning via Looped Language Models

Paper • 2510.25741 • Published Oct 29 • 220

Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning

Paper • 2510.25992 • Published Oct 29 • 45

upvoted 8 papers 2 months ago

Memory Retrieval and Consolidation in Large Language Models through Function Tokens

Paper • 2510.08203 • Published Oct 9 • 9

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 266

LongLive: Real-time Interactive Long Video Generation

Paper • 2509.22622 • Published Sep 26 • 184

The Three Regimes of Offline-to-Online Reinforcement Learning

Paper • 2510.01460 • Published Oct 1 • 1

RLAD: Training LLMs to Discover Abstractions for Solving Reasoning Problems

Paper • 2510.02263 • Published Oct 2 • 8

RLP: Reinforcement as a Pretraining Objective

Paper • 2510.01265 • Published Sep 26 • 40

Generalized Parallel Scaling with Interdependent Generations

Paper • 2510.01143 • Published Oct 1 • 4

Mem-α: Learning Memory Construction via Reinforcement Learning

Paper • 2509.25911 • Published Sep 30 • 14

upvoted 10 papers 3 months ago

Scalable Reinforcement Post-Training Beyond Static Human Prompts: Evolving Alignment via Asymmetric Self-Play

Paper • 2411.00062 • Published Oct 31, 2024 • 1

Distilled Pretraining: A modern lens of Data, In-Context Learning and Test-Time Scaling

Paper • 2509.01649 • Published Sep 1 • 2

LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities

Paper • 2504.16078 • Published Apr 22 • 21

RL's Razor: Why Online Reinforcement Learning Forgets Less

Paper • 2509.04259 • Published Sep 4 • 6

BED-LLM: Intelligent Information Gathering with LLMs and Bayesian Experimental Design

Paper • 2508.21184 • Published Aug 28 • 2

Reinforcement Learning for Machine Learning Engineering Agents

Paper • 2509.01684 • Published Sep 1 • 1

Mixture of Contexts for Long Video Generation

Paper • 2508.21058 • Published Aug 28 • 35

Social World Models

Paper • 2509.00559 • Published Aug 30 • 1

Supporting Our AI Overlords: Redesigning Data Systems to be Agent-First

Paper • 2509.00997 • Published Aug 31 • 2

Differentiable Entropy Regularization for Geometry and Neural Networks

Paper • 2509.03733 • Published Sep 3 • 1