Paper: A UNIVERSITY-LEVEL BENCHMARK FOR EVALUATING MATHEMATICAL SKILLS IN LLMS
Toloka
company
Verified
AI & ML interests
Human-expert data for frontier reasoning, safety and agentic AI
Organization Card
Hey, this is Toloka!
datasets 13
toloka/HomER • 63
toloka/mu-math • 1.08k • 45 • 24
toloka/u-math • 1.1k • 171 • 26
toloka/vist • 39.3k • 19
toloka/VOX-DUB • 7.58k • 82 • 10
toloka/JEEM • 2.2k • 78 • 14
toloka/beemo • 2.19k • 372 • 19
toloka/CLESC • 500 • 17 • 2
toloka/VoxDIY-RusNews • 61 • 3
toloka/CrowdSpeech • 117 • 5