The tasks and counterfactuals from the Mechanistic Interpretability Benchmark.
AI & ML interests
Principled evaluation of mechanistic interpretability methods.
Datasets (7)
mib-bench/ravel
• 117k • 14
mib-bench/arithmetic_subtraction
• 20.9k • 164
mib-bench/arithmetic_addition
• 40.4k • 211
mib-bench/ioi
• 21k • 619
mib-bench/arc_easy
• 4.01k • 381
mib-bench/arc_challenge
• 2k • 269
mib-bench/copycolors_mcqa
• 1.89k • 239