GEM benchmark
https://gem-benchmark.com
AI & ML interests
We develop infrastructure for the evaluation of generated text.
Recent Activity
fladhak authored a paper (2 days ago): SpaceTools: Tool-Augmented Spatial Reasoning via Double Interactive RL
yjernite authored a paper (4 months ago): The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources
yjernite authored a paper (4 months ago): In-House Evaluation Is Not Enough: Towards Robust Third-Party Flaw Disclosure for General-Purpose AI
Team members: 96
GEM's models
None public yet