metadata
title: README
emoji: 💻
colorFrom: pink
colorTo: red
sdk: static
pinned: false
license: mit
short_description: 'ISAAC OS — Neural v1 (Agentic-Lite deterministic evaluation '

ISAAC OS — Neural v1 (Deterministic Evaluation, Agentic-Lite)

Organization: Isaac-AI-OS
Model ID: isaac-20b
Policy Version: agentic-lite-v1
Artifacts Dataset: isaac-20b-eval-artifacts
Docker Digest: isaac-hf@sha256:6fc9f0d85dfe56daba8fc92496718226f056014b3e84ee7a823df1d9271a57c0


ISAAC is a self-verifying neural operating system designed for reproducible, auditable AI.
The Agentic-Lite evaluation mode enforces deterministic sampling (temperature=0, top_p=0, seed=7) and code-only normalization, producing byte-identical artifacts across runs.
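As a rough illustration of the settings above, the sketch below builds a request body with the quoted sampling parameters and applies one possible "code-only" normalization. The field names follow the common OpenAI-style completions schema, and `build_request` and `normalize_code_only` are hypothetical helpers; none of this is confirmed as ISAAC's actual implementation.

```python
import json
import re

# Sampling settings quoted in the text; the field names are an assumption
# (OpenAI-style completions schema), not a documented ISAAC API.
DETERMINISTIC_PARAMS = {"temperature": 0, "top_p": 0, "seed": 7}

FENCE = "`" * 3  # a Markdown code fence, built indirectly to keep this block readable
_FENCE_RE = re.compile(FENCE + r"[A-Za-z0-9_+-]*\n(.*?)" + FENCE, re.DOTALL)


def build_request(prompt: str) -> str:
    """Serialize a request body with a stable key order and no extra spaces."""
    payload = {"model": "isaac-20b", "prompt": prompt, **DETERMINISTIC_PARAMS}
    return json.dumps(payload, sort_keys=True, separators=(",", ":"))


def normalize_code_only(completion: str) -> str:
    """Hypothetical normalization: keep the first fenced block (or the whole
    text if there is none), unify line endings, strip trailing whitespace."""
    match = _FENCE_RE.search(completion)
    code = match.group(1) if match else completion
    lines = [line.rstrip() for line in code.replace("\r\n", "\n").split("\n")]
    return "\n".join(lines).strip("\n") + "\n"
```

Serializing with sorted keys and normalizing whitespace removes two common sources of byte-level drift that are unrelated to the model itself.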

🔍 Current Subset Results

| Benchmark | Split | Metric | Score |
|---|---|---|---|
| HumanEval | N=5 | pass@1 | 0.60 |
| MBPP | N=5 | pass@1 | 0.80 |
| SWE-Bench Lite | 1/1 instance | resolved | 1 / 1 (resolved via fallback_dataset_patch) |
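To make the pass@1 figures concrete: assuming one deterministic sample per problem, pass@1 is simply the fraction of problems whose sample passes its tests, so 0.60 on N=5 corresponds to 3 of 5 problems and 0.80 to 4 of 5. The sketch below shows that arithmetic; the per-problem outcome vectors are hypothetical placeholders, not the published results.

```python
def pass_at_1(results: list[bool]) -> float:
    """Fraction of problems solved, given one pass/fail outcome per problem."""
    if not results:
        raise ValueError("need at least one result")
    return sum(results) / len(results)


# Hypothetical outcome vectors that reproduce the reported scores;
# which specific problems passed is not stated in this README.
humaneval_subset = [True, True, True, False, False]
mbpp_subset = [True, True, True, True, False]

print(f"{pass_at_1(humaneval_subset):.2f}")
print(f"{pass_at_1(mbpp_subset):.2f}")
```

With only five problems per subset, each problem moves the score by 0.20.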

🧩 Reproducibility

  • Deterministic “Agentic-Lite” mode: single plan, no concurrency, fixed seeds.
  • Evaluation artifacts (LM, Code, SWE) are all published for cross-verification.
  • The manifest is pinned to the Docker digest above, giving a full audit trail.
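One way to cross-check the byte-identical claim is to hash every artifact from two independent runs and compare the digests. The sketch below assumes each run writes its artifacts into its own directory; the directory layout and the helper names `digest_tree` and `diff_runs` are illustrative, not part of the published tooling.

```python
import hashlib
from pathlib import Path


def digest_tree(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_runs(run_a: Path, run_b: Path) -> list[str]:
    """Return files that are missing from one run or differ in content."""
    a, b = digest_tree(run_a), digest_tree(run_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty list from `diff_runs` means the two runs produced the same set of files with identical bytes.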

📂 Artifacts

💡 Roadmap

  • Add logprobs to /v1/completions → full multiple-choice (MC) reasoning evals (MMLU/ARC/HellaSwag).
  • Enable kernel-level determinism for multi-node HA.
  • Publish “Agentic Swarm” uplift appendix once replay bundles are live.
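The first roadmap item matters because multiple-choice benchmarks are typically scored by comparing the log-likelihood the model assigns to each candidate answer. The sketch below shows that selection step under the assumption that per-token logprobs for each choice are already available; the input shape and the example values are hypothetical, since the endpoint does not yet return logprobs.

```python
def score_choice(token_logprobs: list[float], length_normalize: bool = False) -> float:
    """Sum (or average) the token log-probabilities of one candidate answer."""
    if not token_logprobs:
        raise ValueError("each choice needs at least one token logprob")
    total = sum(token_logprobs)
    return total / len(token_logprobs) if length_normalize else total


def pick_choice(choice_logprobs: dict[str, list[float]],
                length_normalize: bool = False) -> str:
    """Return the label with the highest score; ties break by label order
    so the result stays deterministic."""
    scores = {label: score_choice(lps, length_normalize)
              for label, lps in choice_logprobs.items()}
    return min(scores, key=lambda label: (-scores[label], label))


# Hypothetical logprobs for a two-option question.
example = {"A": [-1.0, -2.0], "B": [-0.5, -0.7]}
print(pick_choice(example))
```

Length normalization is optional here because benchmarks differ on whether longer answers should be penalized for having more tokens.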