arxiv:2603.20538
Tengyang Xie
tengyangx
AI & ML interests
None yet
Recent Activity
authored a paper 2 days ago
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
authored a paper 2 days ago
Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF
authored a paper 2 days ago
Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts