+- dataset:
+    id: openai/gsm8k
+    task_id: gsm8k
+  value: 11.8
+  source:
+    url: https://huggingface.co/codelion/sprog-9m
+    name: Symbolic verifier, 96-sample self-consistency (single committed answer)
+    user: codelion
+  notes: >-
+    Single-answer exact-match accuracy on the full GSM8K test set (1319 problems),
+    mean of 3 training seeds (range 11.1-12.6%). Inference: 96 temperature samples
+    per question; a 0-parameter symbolic verifier then selects one committed answer
+    via self-consistency. Evaluated with a custom symbolic-program harness, not
+    inspect-ai model_graded_fact. Model: 9.37M-parameter encoder-decoder, trained
+    from scratch.
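
The notes describe selecting one committed answer per question from 96 samples and scoring it by exact match. A minimal sketch of that kind of pipeline follows, under stated assumptions: the function names (`select_committed_answer`, `exact_match_accuracy`, `is_numeric`) are hypothetical, answers are represented as strings, and `is_numeric` is only a stand-in for the symbolic verifier, whose actual checks are not given in this entry. It may differ from the harness used to produce the reported number.

```python
from collections import Counter
from typing import Callable, Optional, Sequence


def is_numeric(answer: str) -> bool:
    """Stand-in verifier (assumption): accept answers that parse as a number."""
    try:
        float(answer)
        return True
    except ValueError:
        return False


def select_committed_answer(
    samples: Sequence[str],
    verify: Callable[[str], bool],
) -> Optional[str]:
    """Keep samples the verifier accepts, then return the most frequent one.

    Returns None when no sample passes, so the question counts as wrong.
    """
    verified = [s for s in samples if verify(s)]
    if not verified:
        return None
    # most_common(1) returns the single (answer, count) pair with the top count.
    return Counter(verified).most_common(1)[0][0]


def exact_match_accuracy(
    predictions: Sequence[Optional[str]],
    references: Sequence[str],
) -> float:
    """Fraction of questions whose committed answer equals the reference."""
    correct = sum(
        p is not None and p == r for p, r in zip(predictions, references)
    )
    return correct / len(references)


# Toy usage with 5 samples instead of 96.
samples = ["18", "18", "x", "20", "18"]
committed = select_committed_answer(samples, is_numeric)
accuracy = exact_match_accuracy([committed, None], ["18", "3"])
```

Committing to a single answer per question means the metric is ordinary exact-match accuracy, not a pass@k-style score over the 96 samples.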