Mechanistic Interpretability for Alignment Algorithms
community
AI & ML interests: AI Safety, Mechanistic Interpretability
Recent Activity
ishangarg183 updated a dataset about 1 hour ago: MInAlA/crosscoder-multilayer-split-activations
ishangarg183 published a dataset 1 day ago: MInAlA/crosscoder-multilayer-split-activations
ishangarg183 updated a dataset 6 days ago: MInAlA/crosscoder-smollm3-ppo
Team members: 5
Models: 18 (sorted by recently updated)
MInAlA/Llama-3.2-3B-Instruct-PPO-merged • Text Generation • 3B • Updated 11 days ago • 259
MInAlA/SmolLM3-3B-PPO-merged • 3B • Updated 12 days ago • 31
MInAlA/Qwen3-4B-Instruct-2507-PPO-merged • Text Generation • 4B • Updated 13 days ago • 419
MInAlA/Llama-3.2-3B-SimPO-merged • Text Generation • 3B • Updated 16 days ago • 297
MInAlA/Qwen3-4B-Instruct-2507-SimPO-merged • Text Generation • 4B • Updated 16 days ago • 36
MInAlA/SmolLM3-3B-SimPO-merged • Text Generation • 3B • Updated 16 days ago • 25
MInAlA/Llama-3.2-3B-Instruct-GRPO-merged • Text Generation • 3B • Updated 18 days ago • 43
MInAlA/Qwen3-4B-Instruct-2507-GRPO-merged • Text Generation • 4B • Updated 19 days ago • 217
MInAlA/SmolLM3-3B-GRPO-merged • Text Generation • 3B • Updated 22 days ago • 29
MInAlA/Llama-3.2-3B-Instruct-KTO-merged • Text Generation • 3B • Updated 22 days ago • 269
Datasets: 5 (sorted by recently updated)
MInAlA/crosscoder-multilayer-split-activations • Updated 1 minute ago • 11
MInAlA/crosscoder-smollm3-ppo • Viewer • Updated 6 days ago • 1 • 28
MInAlA/crosscoder-qwen3-4b-ppo • Viewer • Updated 6 days ago • 1 • 25
MInAlA/crosscoder-llama32-3b-ppo • Viewer • Updated 6 days ago • 1 • 29
MInAlA/medical-tampering-eval • Viewer • Updated 24 days ago • 535 • 47