simpleqa_verified-sample

dvilasuero committed on Nov 20, 2025

Commit 852e47b (verified)

1 Parent(s): 1af51ae

Upload README.md with huggingface_hub

Files changed (1)

README.md

@@ -1,5 +1,5 @@
 ---
-title: Simpleqa Verified Sample
 emoji: 📊
 colorFrom: blue
 colorTo: purple
@@ -8,16 +8,38 @@ sdk_version: "latest"
 pinned: false
 ---
-# Simpleqa Verified Sample
-Live log viewer for eval results stored in [dvilasuero/simpleqa_verified-sample](https://huggingface.co/dvilasuero/simpleqa_verified-sample).
-This Space runs `inspect view` to display real-time evaluation logs from the dataset.
-## View Logs
-Logs are automatically displayed from: `hf://datasets/dvilasuero/simpleqa_verified-sample/logs`
-## Dataset
-Results are stored in: [dvilasuero/simpleqa_verified-sample](https://huggingface.co/dvilasuero/simpleqa_verified-sample)

 ---
+title: Simpleqa Verified Custom
 emoji: 📊
 colorFrom: blue
 colorTo: purple
 pinned: false
 ---
+# simpleqa_verified_custom
+This eval was run using [evaljobs](https://github.com/dvsrepo/evaljobs).
+## Command
+```bash
+evaljobs examples/simpleqa_verified_custom.py \
+  --model hf-inference-providers/openai/gpt-oss-20b:cheapest \
+  --name simpleqa_verified-sample \
+  --limit 10
+```
+## Run with other models
+To run this eval with a different model, use:
+```bash
+evaljobs https://huggingface.co/spaces/dvilasuero/simpleqa_verified-sample \
+  --model <your-model> \
+  --name <your-name> \
+  --flavor cpu-basic
+```
+## Inspect eval command
+The eval was executed with:
+```bash
+inspect eval eval.py \
+  --model hf-inference-providers/openai/gpt-oss-20b:cheapest \
+  --limit 10 \
+  --log-shared \
+  --log-buffer 100
+```
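
Note (not part of the commit): the "Run with other models" command in the new README takes one model per invocation. A minimal sketch of how it might be repeated across several models is shown here. It only prints the commands (a dry run) and uses only the flags that appear in the README. The second model identifier and the `simpleqa-` run-name scheme are assumptions made for illustration, not something the README or evaljobs prescribes.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Space URL taken from the README's "Run with other models" section.
SPACE_URL="https://huggingface.co/spaces/dvilasuero/simpleqa_verified-sample"

# First entry is the model used in the README; the second is a
# hypothetical placeholder. Replace it with a model you can access.
MODELS=(
  "hf-inference-providers/openai/gpt-oss-20b:cheapest"
  "my-org/my-model"
)

# Print the evaljobs command for one model.
# The run name is an assumed scheme: "simpleqa-" plus the model id with
# every non-alphanumeric character replaced by "-".
build_cmd() {
  local model="$1"
  local name="simpleqa-${model//[^a-zA-Z0-9]/-}"
  printf 'evaljobs %s --model %s --name %s --flavor cpu-basic\n' \
    "$SPACE_URL" "$model" "$name"
}

# Dry run: print each command instead of executing it.
for model in "${MODELS[@]}"; do
  build_cmd "$model"
done
```

To launch the jobs for real, the printed lines could be reviewed and then pasted into a shell one at a time.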