---
license: apache-2.0
tags: [evaluation, reproducibility, agentic-lite, code, swe-bench]
pipeline_tag: text-generation
---
# ISAAC OS — Neural v1 (Deterministic Eval, Agentic-Lite)
**Model ID:** `isaac-20b`
**Policy Version:** `agentic-lite-v1`
**Docker Digest:** `isaac-hf@sha256:6fc9f0d85dfe56daba8fc92496718226f056014b3e84ee7a823df1d9271a57c0`
## Results (subset scale)
| Benchmark | Subset | Metric | Score |
|---|---|---:|---:|
| HumanEval | N=5 | pass@1 | **0.60** |
| MBPP | N=5 | pass@1 | **0.80** |
| SWE-Bench Lite | N=1 | model pass@1 | — |
| SWE-Bench Lite | N=1 | resolved via **fallback_dataset_patch** | **1 / 1** |
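
With a single sample per problem (n=1), pass@1 reduces to the fraction of problems solved, so 0.60 on N=5 corresponds to 3 of 5 and 0.80 to 4 of 5. The sketch below shows that arithmetic. The per-problem outcome lists are hypothetical placeholders: the card does not say which problems passed.

```python
def pass_at_1(passed: list[bool]) -> float:
    """Fraction of problems whose single sample passed (n=1 per problem)."""
    if not passed:
        raise ValueError("need at least one problem")
    return sum(passed) / len(passed)


# Hypothetical outcomes chosen only to match the reported scores.
humaneval = [True, True, True, False, False]  # 3 of 5
mbpp = [True, True, True, True, False]        # 4 of 5

print(f"HumanEval pass@1: {pass_at_1(humaneval):.2f}")
print(f"MBPP pass@1: {pass_at_1(mbpp):.2f}")
```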
## Reproducibility
The Agentic-Lite policy clamps decoding to temperature=0, top_p=0, top_k=1, n=1, and seed=7. Tools run deterministically, with no concurrency and max_steps=6. Outputs go through first-line QA and code-only normalization. The evaluation runs on a single node.
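
As a rough illustration of how such clamps might be enforced, the sketch below overrides any requested sampling settings with the fixed values listed above. The class name, the `clamp` helper, and the `allow_concurrency` field are hypothetical. They are not taken from the ISAAC codebase.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AgenticLiteClamps:
    # Values taken from the Reproducibility section of this card.
    temperature: float = 0.0
    top_p: float = 0.0
    top_k: int = 1
    n: int = 1
    seed: int = 7
    max_steps: int = 6
    allow_concurrency: bool = False  # hypothetical name for "no concurrency"


def clamp(requested: dict) -> dict:
    """Return a copy of `requested` with every clamped field forced to its fixed value."""
    return {**requested, **asdict(AgenticLiteClamps())}


# A caller asking for a high temperature still gets the clamped settings.
settings = clamp({"prompt": "def add(a, b):", "temperature": 0.9})
print(settings["temperature"], settings["seed"], settings["max_steps"])
```

Because the clamps are applied last in the merge, the caller's other fields such as the prompt pass through unchanged.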
## Artifacts & Manifest
- LM: https://huggingface.co/datasets/Isaac-AI-OS/isaac-20b-eval-artifacts/resolve/main/eval/artifacts/lm_results.norm.json
- Code summary: https://huggingface.co/datasets/Isaac-AI-OS/isaac-20b-eval-artifacts/resolve/main/eval/artifacts/code/summary.json
- SWE-Lite: https://huggingface.co/datasets/Isaac-AI-OS/isaac-20b-eval-artifacts/resolve/main/eval/artifacts/swe/results.json
- Manifest: https://huggingface.co/datasets/Isaac-AI-OS/isaac-20b-eval-artifacts/resolve/main/eval/artifacts/manifest.json
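
One way to check a local copy against the published artifacts is to download each file and record its SHA-256 digest. The sketch below uses only the Python standard library. It assumes each artifact is a JSON file, as the extensions suggest, and makes no assumption about the manifest's schema, which this card does not describe. The short keys in `ARTIFACTS` and both function names are hypothetical.

```python
import hashlib
import json
from urllib.request import urlopen

BASE = (
    "https://huggingface.co/datasets/Isaac-AI-OS/"
    "isaac-20b-eval-artifacts/resolve/main/eval/artifacts"
)

# Relative paths copied from the artifact links above.
ARTIFACTS = {
    "lm": "lm_results.norm.json",
    "code": "code/summary.json",
    "swe": "swe/results.json",
    "manifest": "manifest.json",
}


def artifact_url(name: str) -> str:
    """Build the full download URL for one named artifact."""
    return f"{BASE}/{ARTIFACTS[name]}"


def fetch_with_digest(name: str, timeout: float = 30.0) -> tuple[object, str]:
    """Download an artifact; return its parsed JSON and the SHA-256 of the raw bytes."""
    with urlopen(artifact_url(name), timeout=timeout) as resp:
        raw = resp.read()
    return json.loads(raw), hashlib.sha256(raw).hexdigest()
```

Calling `fetch_with_digest("manifest")` requires network access. The returned digest can then be compared across machines or runs.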