qqliangqi's Collections
some benchmark
Updated Oct 28, 2025
cais/mmlu • Viewer • Updated Mar 8, 2024 • 231k • 297k • 632
TIGER-Lab/MMLU-Pro • Benchmark • Updated 11 days ago • 12.1k • 84.9k • 412
cais/hle • Benchmark • Updated 9 days ago • 2.5k • 20.5k • 671
m-a-p/SuperGPQA • Viewer • Updated Apr 30, 2025 • 26.5k • 4.92k • 80
lmarena-ai/arena-hard-auto • Updated May 1, 2025 • 203 • 6
MT Bench 📊 • Running • 202 • Compare AI model responses side-by-side