| title: README | |
| emoji: 💻 | |
| colorFrom: pink | |
| colorTo: red | |
| sdk: static | |
| pinned: false | |
| license: mit | |
| short_description: 'ISAAC OS — Neural v1 (Agentic-Lite deterministic evaluation ' | |
| # ISAAC OS — Neural v1 (Deterministic Evaluation, Agentic-Lite) | |
| **Organization:** [Isaac-AI-OS](https://huggingface.co/Isaac-AI-OS) | |
| **Model ID:** [`isaac-20b`](https://huggingface.co/Isaac-AI-OS/isaac-20b) | |
| **Policy Version:** `agentic-lite-v1` | |
| **Artifacts Dataset:** [`isaac-20b-eval-artifacts`](https://huggingface.co/datasets/Isaac-AI-OS/isaac-20b-eval-artifacts) | |
| **Docker Digest:** `isaac-hf@sha256:6fc9f0d85dfe56daba8fc92496718226f056014b3e84ee7a823df1d9271a57c0` | |
| --- | |
| ISAAC is a **self-verifying neural operating system** designed for reproducible, auditable AI. | |
| The *Agentic-Lite* evaluation mode enforces deterministic sampling (`temperature=0`, `top_p=0`, `seed=7`) and code-only normalization, producing byte-identical artifacts across runs. | |
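The sampling settings above can be sketched as a request body. This is a minimal illustration, assuming an OpenAI-style completions payload; the `model`, `prompt`, and `max_tokens` field names and both helper functions are hypothetical and not confirmed by this README. Only `temperature=0`, `top_p=0`, and `seed=7` come from the text.

```python
import json


def deterministic_request(prompt: str, max_tokens: int = 256) -> dict:
    """Build a completions payload with the Agentic-Lite sampling settings.

    Field names other than temperature/top_p/seed are assumptions.
    """
    return {
        "model": "isaac-20b",      # model ID from this README
        "prompt": prompt,
        "max_tokens": max_tokens,  # hypothetical default
        "temperature": 0,
        "top_p": 0,
        "seed": 7,
    }


def canonical_json(payload: dict) -> bytes:
    """Serialize with sorted keys and fixed separators so equal payloads
    always produce the same bytes."""
    text = json.dumps(payload, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")


body = canonical_json(deterministic_request("def add(a, b):"))
```

Serializing canonically keeps the request side of a run byte-stable, which makes it easier to attribute any difference between runs to the model output rather than to the harness.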
| ### 🔍 Current Subset Results | |
| | Benchmark | Split | Metric | Score | | |
| |------------|--------|---------|------:| | |
| | **HumanEval** | N=5 | pass@1 | **0.60** | | |
| | **MBPP** | N=5 | pass@1 | **0.80** | | |
| | **SWE-Bench Lite** | N=1 | resolved (via `fallback_dataset_patch`) | **1 / 1** | |
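For reading the pass@1 rows: assuming one deterministic sample per problem (consistent with `temperature=0`, but not stated explicitly), pass@1 is the fraction of problems whose single sample passes. On N=5 each problem moves the score by 0.20, so 0.60 corresponds to 3 of 5 and 0.80 to 4 of 5. The sketch below uses illustrative outcome lists; which specific problems passed is not given here.

```python
def pass_at_1(passed: list[bool]) -> float:
    """Fraction of problems solved when each problem has exactly one sample."""
    if not passed:
        raise ValueError("need at least one problem")
    return sum(passed) / len(passed)


# Illustrative outcome patterns only: the counts follow from the reported
# scores, but the order of passes and failures is made up.
humaneval_subset = [True, True, True, False, False]
mbpp_subset = [True, True, True, True, False]

print(pass_at_1(humaneval_subset))  # 0.6
print(pass_at_1(mbpp_subset))       # 0.8
```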
| --- | |
| ### 🧩 Reproducibility | |
| - Deterministic “Agentic-Lite” mode: single plan, no concurrency, fixed seeds. | |
| - Evaluation artifacts (LM, Code, SWE) are all published for cross-verification. | |
| - The manifest is pinned to the Docker digest above, giving a full audit trace. | |
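A pin like the one described above can be checked mechanically. The sketch below parses the `name@sha256:<hex>` reference from this README and compares it with a value read from a manifest. The manifest schema is not documented here, so the `docker_digest` key is a hypothetical placeholder, as are the function names.

```python
import json
import re

# Digest reference copied from the header of this README.
PINNED = ("isaac-hf@sha256:"
          "6fc9f0d85dfe56daba8fc92496718226f056014b3e84ee7a823df1d9271a57c0")

# name@sha256:<64 lowercase hex characters>
DIGEST_REF = re.compile(r"^(?P<name>[^@\s]+)@sha256:(?P<hex>[0-9a-f]{64})$")


def parse_digest_ref(ref: str) -> tuple[str, str]:
    """Split an image reference into (name, hex digest), validating its shape."""
    match = DIGEST_REF.match(ref)
    if match is None:
        raise ValueError(f"not a sha256 digest reference: {ref!r}")
    return match["name"], match["hex"]


def manifest_is_pinned(manifest_text: str, pinned: str = PINNED,
                       key: str = "docker_digest") -> bool:
    """Return True if the manifest records exactly the pinned reference.

    `key` is an assumed field name; adjust it to the real manifest schema.
    """
    manifest = json.loads(manifest_text)
    return manifest.get(key) == pinned
```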
| ### 📂 Artifacts | |
| - [LM normalized results](https://huggingface.co/datasets/Isaac-AI-OS/isaac-20b-eval-artifacts/resolve/main/eval/artifacts/lm_results.norm.json) | |
| - [Code benchmarks](https://huggingface.co/datasets/Isaac-AI-OS/isaac-20b-eval-artifacts/resolve/main/eval/artifacts/code/summary.json) | |
| - [SWE-Bench Lite results](https://huggingface.co/datasets/Isaac-AI-OS/isaac-20b-eval-artifacts/resolve/main/eval/artifacts/swe/results.json) | |
| - [Manifest](https://huggingface.co/datasets/Isaac-AI-OS/isaac-20b-eval-artifacts/resolve/main/eval/artifacts/manifest.json) | |
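One way to cross-verify the byte-identical claim is to hash the artifact files from two runs and compare the digests. This is a sketch under assumptions: it expects two local directories with the same relative layout (for example, two downloads or two local re-runs), and the function names are hypothetical rather than part of ISAAC's tooling.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_tree(root: Path) -> dict[str, str]:
    """Map each file's path relative to `root` to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def differing_files(run_a: Path, run_b: Path) -> list[str]:
    """List relative paths that are missing from one run or differ in bytes."""
    a, b = digest_tree(run_a), digest_tree(run_b)
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))
```

An empty list from `differing_files` means the two runs match byte for byte. Any entry names a file to inspect.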
| ### 💡 Roadmap | |
| - Add `logprobs` to `/v1/completions` to enable full multiple-choice (MC) evaluation (MMLU/ARC/HellaSwag). | |
| - Enable kernel-level determinism for multi-node high availability (HA). | |
| - Publish “Agentic Swarm” uplift appendix once replay bundles are live. |
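The `logprobs` roadmap item matters because multiple-choice benchmarks are commonly scored by comparing the log-probability the model assigns to each candidate answer. The sketch below shows that general technique, not ISAAC's implementation. It assumes the per-token log-probabilities for each choice have already been obtained, and the function names and example numbers are made up for illustration.

```python
def score_choice(token_logprobs: list[float], normalize: bool = False) -> float:
    """Sum the per-token log-probabilities of one candidate answer.

    With normalize=True the sum is divided by the token count, which avoids
    penalizing longer answers.
    """
    if not token_logprobs:
        raise ValueError("a choice needs at least one token")
    total = sum(token_logprobs)
    return total / len(token_logprobs) if normalize else total


def pick_answer(choices: dict[str, list[float]], normalize: bool = False) -> str:
    """Return the label of the highest-scoring choice."""
    return max(choices, key=lambda label: score_choice(choices[label], normalize))


# Made-up log-probabilities: "A" is a three-token answer, "B" a single token.
example = {"A": [-0.5, -0.5, -0.5], "B": [-1.2]}
print(pick_answer(example))                  # B
print(pick_answer(example, normalize=True))  # A
```

The example shows why the normalization choice should be recorded alongside results: the same log-probabilities select different answers depending on whether scores are length-normalized.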