Collection of models and datasets for Beyond Binary Rewards: Training LMs to Reason about their Uncertainty
Mehul Damani
mehuldamani
AI & ML interests
Reinforcement Learning, Large Language Models
Recent Activity
updated a dataset 35 minutes ago: mehuldamani/countdown_code_RLVR
published a dataset 35 minutes ago: mehuldamani/countdown_code_RLVR
updated a dataset 40 minutes ago: mehuldamani/countdown_code_full
Organizations
None yet