RLHF Trojan Competition
Collection
Datasets and models used for the trojan detection competition co-located with SaTML 2024: https://github.com/ethz-spylab/rlhf_trojan_competition
This reward model was used to align this generation model for the trojan detection competition co-located with SaTML 2024. For more information, visit the official competition website.