OpenEvals
community
AI & ML interests: LLM evaluation
Recent Activity
clefourrier authored a paper about 9 hours ago: Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections
SaylorTwift, new activity 4 days ago, OpenEvals/README: New Benchmark Dataset
SaylorTwift, new activity about 1 month ago, OpenEvals/README: Community Evals Feedback
Team members: 8
OpenEvals's datasets: 4
OpenEvals/IMO-AnswerBench • Viewer • Updated Jan 23 • 400 • 161 • 1
OpenEvals/MuSR • Viewer • Updated Dec 12, 2025 • 756 • 45
OpenEvals/aime_24 • Viewer • Updated Dec 12, 2025 • 30 • 79 • 1
OpenEvals/SimpleQA • Viewer • Updated Dec 12, 2025 • 4.33k • 1.01k • 4