occ-stack / README.md
narcolepticchicken's picture
Update ML Intern artifact metadata
78c189b verified
---
tags:
- ml-intern
---
# Oracle-Credit-Compute (OCC) Stack
A minimal, open-source research prototype for **agentic compute allocation** where agents earn and spend non-transferable, decaying credits based on verified marginal impact.
## Quickstart
```bash
git clone https://huggingface.co/narcolepticchicken/occ-stack
cd occ-stack
pip install -r requirements.txt
# Simulated benchmarks (CPU)
python benchmarks/benchmark_code.py # Code compute allocation
python benchmarks/benchmark_retrieval_qa.py # Retrieval QA
python benchmarks/benchmark_debate_v2.py # Multi-agent debate
# Ablations + anti-gaming (CPU, ~5 min)
python eval_runner.py
# Real LLM benchmark (GPU, requires T4+)
python jobs/run_real_llm_standalone_v7.py
# Unit tests
python tests/test_oracle.py
python tests/test_ledger.py
```
## Architecture
```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Agent │───▢│ ResourceBroker │───▢│ Compute β”‚
β”‚ (requests β”‚ β”‚ (allow/deny/ β”‚ β”‚ (model call,β”‚
β”‚ resource) │◄───│ downgrade) │◄───│ retrieval) β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚ β”‚
β–Ό β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ CreditLedger│◄───│ ImpactOracle β”‚
β”‚ (earn/spend/β”‚ β”‚ (score action β”‚
β”‚ decay) β”‚ β”‚ on verified β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ impact) β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```
## Key Results (Simulated)
- **52.3% compute reduction at iso-accuracy** on code benchmark (OCC tiered escalation vs fixed budget)
- **76% accuracy with 40% adversarial agents** in debate (OCC credit-filtering vs 56% naive confidence voting)
- **All anti-gaming attacks contained:** hidden-test gaming, collusion, over-abstention, spam
## Status
| Component | Status |
|-----------|--------|
| Impact Oracle | βœ… Working |
| Credit Ledger | βœ… Working |
| Resource Broker | βœ… Working |
| GRPO/RL Hook | βœ… Factory ready |
| Simulated benchmarks | βœ… Complete |
| Ablations (10 conditions) | βœ… Complete |
| Anti-gaming tests | βœ… Complete |
| Real LLM benchmark | πŸ”„ V7 in progress |
| GRPO training | πŸ”„ Not yet run |
## Repo Structure
```
occ/
oracle/ # ImpactOracle β€” rule-based scoring
ledger/ # CreditLedger β€” non-transferable, decaying credits
broker/ # ResourceBroker β€” capability-based access control
rl/ # RewardHook, OfflineComparator β€” TRL GRPO integration
benchmarks/ # 3 benchmark scripts + real LLM variants
tests/ # Unit tests
reports/ # Reports, results, blog post
jobs/ # Self-contained GPU job scripts
```
## Citation
```bibtex
@misc{occ2026,
title={Oracle-Credit-Compute: A Minimal Stack for Agentic Compute Allocation},
author={narcolepticchicken},
year={2026},
url={https://huggingface.co/narcolepticchicken/occ-stack}
}
```
<!-- ml-intern-provenance -->
## Generated by ML Intern
This model repository was generated by [ML Intern](https://github.com/huggingface/ml-intern), an agent for machine learning research and development on the Hugging Face Hub.
- Try ML Intern: https://smolagents-ml-intern.hf.space
- Source code: https://github.com/huggingface/ml-intern
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = 'narcolepticchicken/occ-stack'
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
```
For non-causal architectures, replace `AutoModelForCausalLM` with the appropriate `AutoModel` class.