Add GSM8K evaluation result

#112

by burtenshaw HF Staff - opened Jan 16

base: refs/heads/main ← from: refs/pr/112


burtenshaw

Jan 16

Evaluation Results

This PR adds structured evaluation results using the new .eval_results/ format.

What This Enables

Model Page: Results appear on the model page with benchmark links
Leaderboards: Scores are aggregated into benchmark dataset leaderboards
Verification: Support for cryptographic verification of evaluation runs

Format Details

Results are stored as YAML files in the .eval_results/ folder of the model repository. See the Eval Results Documentation for the full specification.
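For anyone who wants to check which result files a local checkout of the repository contains, here is a minimal sketch. It only lists files under .eval_results/ and does not parse them, since the schema is defined in the Eval Results Documentation rather than in this PR. The helper name `list_eval_result_files` and the `.yaml`/`.yml` extension filter are assumptions made for illustration, not part of the format specification.

```python
from pathlib import Path


def list_eval_result_files(repo_dir):
    """Return the YAML files found under <repo_dir>/.eval_results/, sorted by path.

    Hypothetical helper for illustration; not part of any Hugging Face library.
    """
    results_dir = Path(repo_dir) / ".eval_results"
    if not results_dir.is_dir():
        # No structured results have been added to this checkout.
        return []
    # Assumption: result files use a .yaml or .yml extension.
    return sorted(
        p for p in results_dir.rglob("*")
        if p.is_file() and p.suffix in {".yaml", ".yml"}
    )


if __name__ == "__main__":
    # Run from the root of a local clone of the model repository.
    for path in list_eval_result_files("."):
        print(path)
```

An empty list means the checkout has no .eval_results/ folder yet, which is the expected state on main until this PR is merged.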

Generated by community-evals

Add GSM8K evaluation result 303c8205

xujfcn

16 days ago

Great discussion! For anyone wanting to quickly test this, Crazyrouter offers API access to this model. No infrastructure setup needed — just an API key and the standard OpenAI SDK.


Ready to merge

This branch is ready to get merged automatically.
