| # OCC: Oracle-Credit-Compute for Agentic Resource Allocation |
|
|
| ## Technical Report, May 2026 (v12, CORRECTED FRAMING) |
|
|
| **Status:** Report complete. Real-LLM validation on H200. Two-seed debate run finished; mechanism isolation experiment planned. TruthfulQA scored with the AllenAI judges. |
|
|
| **Core claim (revised):** In multi-agent systems, compute is not neutral. Extra turns, tokens, and tool calls amplify adversarial influence unless access to deliberation is governed by verified marginal contribution. OCC is a mechanism-design layer that treats compute allocation as a security boundary, making agent compute scarce, earned, capability-scoped, decaying, and auditable. |
|
|
| --- |
|
|
| ## PART 0: WHY THIS MATTERS |
|
|
| Modern agent systems waste compute because they treat it as free speech. Every agent, tool call, debate turn, retrieval, and verifier pass is a potential attack surface: a channel through which unreliable, adversarial, or simply wrong agents can amplify their influence. |
|
|
| The current paradigm: |
| - Equal-turn debate → adversarial voice equals honest voice |
| - Unlimited retry → reward for persistence, not correctness |
| - No credit accounting → no incentive for efficiency |
| - No decay → stale trust persists forever |
|
|
| OCC changes the paradigm: **compute is a scarce privilege, not a right.** |
|
|
| --- |
|
|
| ## PART I: THE COLLAPSE - Evidence That Allocation Matters |
|
|
| ### Benchmark 1: Multi-Agent Debate Under Shared Compute |
|
|
| **Setup:** 30 debate topics, 4 agents (3 honest + 1 adversarial), Qwen3-Coder-30B-A3B-Instruct on H200. Global credit pool. 2 seeds (42, 123). |
|
|
| #### The Core Finding |
|
|
| | Condition | Accuracy | Tokens | |
| |-----------|----------|--------| |
| | **Equal 1-round** (baseline) | **88.3%** | 41.8k | |
| | **Equal 3-round** | **56.7%** | 149.8k | |
| | Random drop (25%) | 85.0% | 30.7k | |
| | OCC 180/3 | 83.3% | 41.0k | |
| | OCC 120/3 | 85.0% | 42.7k | |
|
|
| **The collapse is stark and replicated across both seeds.** Each seed produces exactly 17/30 = 56.7%. An adversarial agent given 3× the speaking time drags the group 32pp below baseline. About 3.6× the tokens (149.8k vs. 41.8k) buys accuracy only a few points above a coin flip. |
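|
|
| The headline numbers can be re-derived from the results table with a few lines of arithmetic. The sketch below uses only values reported above; the variable names are illustrative. |
|
```python
# Values copied from the results table above.
baseline_acc = 88.3       # Equal 1-round accuracy (%)
collapsed_correct = 17    # correct topics per seed, Equal 3-round
n_topics = 30
baseline_tokens = 41.8    # thousands of tokens, Equal 1-round
collapsed_tokens = 149.8  # thousands of tokens, Equal 3-round

collapsed_acc = 100 * collapsed_correct / n_topics  # accuracy in percent
drop_pp = baseline_acc - collapsed_acc              # drop in percentage points
token_ratio = collapsed_tokens / baseline_tokens    # relative token spend

print(f"{collapsed_acc:.1f}% accuracy, {drop_pp:.1f}pp drop, {token_ratio:.2f}x tokens")
```
|
| This gives 56.7% accuracy, a 31.6pp drop, and roughly 3.58× the tokens of the 1-round baseline. |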
|
|
| #### What This Is NOT Saying |
|
|
| - **NOT**: "Debate is bad." Debate can surface truth. But debate without allocation control creates an exploitable communication channel. |
| - **NOT**: "Multi-agent systems are harmful." The collapse only occurs in adversarial contexts; it shows that compute allocation must be adversary-aware. |
| - **NOT**: "OCC solves everything." OCC prevents the collapse but does not outperform random gating at moderate budgets. |
|
|
| #### What This IS Saying |
|
|
| **Debate without allocation control amplifies adversarial influence.** Giving every agent equal turns is like giving every network packet equal bandwidth during a DDoS. The attacker's packets aren't better; there are just more of them, and they eventually overwhelm the honest signal. |
|
|
| OCC treats turns/tokens as scarce, auditable privileges rather than free speech coupons. |
|
|
| ### The Mechanism Question |
|
|
| Why does equal-3-round collapse? Several hypotheses (to be tested by the mechanism isolation experiment at `jobs/occ_debate_collapse_mechanism.py`): |
|
|
| | Hypothesis | Test | Prediction | |
| |------------|------|------------| |
| | H1: Volume | Equal token, unequal turn budget | If collapse disappears, volume caused it | |
| | H2: Recency | Randomized speaking order | If collapse softens, last-speaker bias caused it | |
| | H3: Protocol | Judge-based voting instead of majority | If collapse disappears, majority voting is the vulnerability | |
| | H4: Contamination | Track honest agent answer retention | If honest agents flip toward the adversary, contamination caused it | |
| | H5: Entropy | Confidence-weighted voting | If collapse reverses, uncertainty rather than persuasion caused it | |
| | H6: Prompt | Vary adversary skill (weak/normal/strong/oracle) | If only strong prompts collapse, it is a prompt artifact | |
| | H7: Selection | Stratify by topic difficulty | If only some topics collapse, it is selection bias | |
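|
|
| H3 and H5 both change how agent answers are aggregated. A minimal sketch of the two rules being contrasted, assuming each agent reports an answer and a confidence in [0, 1]. The function names and toy values are hypothetical and are not taken from the experiment script. |
|
```python
from collections import Counter, defaultdict


def majority_vote(answers):
    """Return the most common answer (ties go to the answer seen first)."""
    return Counter(answers).most_common(1)[0][0]


def confidence_weighted_vote(answers, confidences):
    """Return the answer with the largest summed confidence."""
    weights = defaultdict(float)
    for answer, conf in zip(answers, confidences):
        weights[answer] += conf
    return max(weights, key=weights.get)


# Toy round (hypothetical): one confident vote for "A", three unsure votes for "B".
answers = ["A", "B", "B", "B"]
confidences = [0.9, 0.2, 0.2, 0.3]

by_majority = majority_vote(answers)
by_confidence = confidence_weighted_vote(answers, confidences)
print(by_majority, by_confidence)
```
|
| In this toy round majority voting picks "B" while confidence weighting picks "A", which is the kind of divergence the H5 test looks for. |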
|
|
| The mechanism isolation experiment produces: |
| - Round-by-round honest answer retention rates |
| - Adversary-induced flip counts |
| - Per-topic transition matrices (correct→correct, correct→wrong, wrong→correct, wrong→wrong) |
| - The minimal adversarial ratio needed for collapse |
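|
|
| The retention and transition outputs listed above could be computed along these lines. This is a sketch under assumptions: answers are compared to a single ground-truth label per item, and the helper names and toy data are hypothetical rather than taken from `jobs/occ_debate_collapse_mechanism.py`. |
|
```python
from collections import Counter


def transition_counts(before, after, truth):
    """Count correct/wrong transitions of honest answers between two rounds."""
    counts = Counter()
    for b, a, t in zip(before, after, truth):
        src = "correct" if b == t else "wrong"
        dst = "correct" if a == t else "wrong"
        counts[(src, dst)] += 1
    return counts


def retention_rate(counts):
    """Share of initially correct answers that are still correct afterwards."""
    kept = counts[("correct", "correct")]
    lost = counts[("correct", "wrong")]
    return kept / (kept + lost) if kept + lost else float("nan")


# Toy data (hypothetical): one honest agent's answers on five topics.
truth = ["A", "B", "A", "B", "A"]
round1 = ["A", "B", "A", "A", "A"]
round3 = ["A", "A", "B", "A", "A"]

counts = transition_counts(round1, round3, truth)
print(dict(counts), retention_rate(counts))
```
|
| Here two of the four initially correct answers flip to wrong, so retention is 0.5. Counting only flips that land on the adversary's answer would give the adversary-induced flip count. |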
|
|
| This transforms "56.7% is scary" into "adversarial compute amplification follows this specific mechanism and can be mitigated by these specific controls." |
|
|
| --- |
|
|
| ## PART II: TRUTHFULQA - Judge-Dependence |
|
|
| **Setup:** 60 TruthfulQA questions, Qwen3-Coder-30B-A3B-Instruct, AllenAI Llama2-7B truth + info judges. |
|
|
| | Condition | Truthful | Informative | Both | Tokens | |
| |-----------|----------|-------------|------|--------| |
| | A: Direct | **0.917** | 1.000 | **0.917** | 7,198 | |
| | B: OCC Tiered | 0.867 | 1.000 | 0.867 | 6,692 | |
| | **C: OCC+Abstain** | **0.917** | 0.967 | 0.883 | **5,682** | |
|
|
| **Key lesson: The choice of oracle determines the result.** |
|
|
| Under string matching (our earlier scoring), the model looked terrible (0.325 truthful). Under AllenAI's semantic judge, it's excellent (0.917). The same answers, different judges, 59pp swing. |
|
|
| This is not a bug; it's a feature to study. The Oracle Reliability section of the formal definition (`design.md`) maps out oracle types from ground-truth oracle (ceiling) through LLM judge (practical) to noisy/adversarial oracles (robustness tests). |
|
|
| OCC+Abstain matches Direct on truthfulness (0.917) with 21.1% fewer tokens, though its informativeness (0.967) and combined score (0.883) are lower. But the savings are modest and the abstention rate is tiny (3.3%) under the AllenAI judge. Under string matching, abstention was 28%. The mechanism's value is judge-dependent. |
|
|
| **Honest assessment:** TruthfulQA does not strongly support or undermine OCC. It demonstrates oracle-dependence. Move to appendix for publication. |
|
|
| --- |
|
|
| ## PART III: HUMANEVAL - Adaptive Retry, Not Credit Allocation |
|
|
| **Setup:** HumanEval 164 problems, Qwen3-Coder-30B-A3B-Instruct, two-pass OCC (128-token first pass, 1024-token retry on failure). |
|
|
| | Platform | Pass@1 | Savings | |
| |----------|--------|---------| |
| | H200 | **42.1%** | 67.8% | |
| | Blackwell | 33.5% | 62.6% | |
|
|
| The savings are real and cross-platform (63-68%), but this is **adaptive retry**, not OCC credit allocation. The OCC label is aspirational: the actual mechanism is "cheap first pass, expensive retry." |
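|
|
| The "cheap first pass, expensive retry" scheme can be sketched as follows. `generate` and `passes_tests` are hypothetical callables standing in for the model call and the HumanEval test harness; only the 128- and 1024-token budgets come from the setup above. |
|
```python
from typing import Callable, Tuple


def two_pass(problem: str,
             generate: Callable[[str, int], str],
             passes_tests: Callable[[str, str], bool],
             first_budget: int = 128,
             retry_budget: int = 1024) -> Tuple[str, int]:
    """Try a cheap generation first; pay for the large budget only on failure.

    Returns the final solution and the total token budget that was granted.
    """
    attempt = generate(problem, first_budget)
    if passes_tests(problem, attempt):
        return attempt, first_budget
    retry = generate(problem, retry_budget)
    return retry, first_budget + retry_budget


def fake_generate(problem: str, max_tokens: int) -> str:
    # Hypothetical stand-in for a model call: "hard" problems need the larger budget.
    return "solved" if (problem == "easy" or max_tokens >= 1024) else "truncated"


def fake_tests(problem: str, solution: str) -> bool:
    # Hypothetical stand-in for running the unit tests.
    return solution == "solved"


easy = two_pass("easy", fake_generate, fake_tests)
hard = two_pass("hard", fake_generate, fake_tests)
print(easy, hard)
```
|
| Problems solved on the first pass are granted 128 tokens, and the rest are granted 128 + 1024. Nothing in this loop tracks credit, decay, or per-agent value, which is why it is adaptive retry rather than OCC allocation. |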
|
|
| **For this to become an OCC result**, it needs an agentic version: |
| - Generator agent spends credits to propose solutions |
| - Tester agent earns credits for catching bugs |
| - Repair agent earns credits only if patch passes |
| - Credits decay across problems |
| - Agents with low marginal value lose budget |
|
|
| Until then, HumanEval is a practical but orthogonal finding. It belongs in the appendix or as a separate "adaptive inference" note. |
|
|
| --- |
|
|
| ## PART IV: OCC SYSTEM β What It Actually Is |
|
|
| See `design.md` for the full formal definition. Here's the summary: |
|
|
| ### Components |
|
|
| 1. **Impact Oracle** (`oracle.py`): Scores whether an action produced measurable marginal value. Supports code, QA, debate, and retrieval scoring modes. |
|
|
| 2. **Credit Ledger** (`ledger.py`): Non-transferable, decaying, capability-scoped credits with immutable audit trail. Every credit mutation is an append-only event with provenance. |
|
|
| 3. **Resource Broker** (`broker.py`): Capability-based access control. Decides allow/deny/downgrade/escalate/require-approval per resource type. |
|
|
| 4. **GRPO Reward Hook** (`grpo_hook.py`): TRL-compatible reward function combining oracle score + anti-gaming penalties. Validated end-to-end. |
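|
|
| To make the reward hook concrete, here is a minimal sketch of a reward that combines an oracle score with a token-cost penalty. It assumes the calling convention TRL uses for custom reward functions (completions passed as a keyword argument, one float returned per completion) and plain-string completions. `make_occ_reward`, `cost_per_token`, the toy oracle, and the whitespace token count are all hypothetical; this is not the contents of `grpo_hook.py`, and the other anti-gaming penalties are omitted. |
|
```python
def make_occ_reward(oracle, cost_per_token=0.001):
    """Build a reward function: oracle score minus a per-token cost penalty."""

    def occ_reward(completions, **kwargs):
        rewards = []
        for text in completions:
            score = float(oracle(text))   # external verifier, assumed to return [0, 1]
            n_tokens = len(text.split())  # crude whitespace proxy for token count
            rewards.append(score - cost_per_token * n_tokens)
        return rewards

    return occ_reward


def toy_oracle(text):
    # Hypothetical oracle: full credit if the expected answer appears.
    return 1.0 if "42" in text else 0.0


reward_fn = make_occ_reward(toy_oracle)
rewards = reward_fn(completions=["42", "the answer is probably something else " * 10])
print(rewards)
```
|
| The short correct completion scores just under 1.0, while the long incorrect one goes negative, so verbosity alone cannot earn reward. |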
|
|
| ### Core Invariants |
|
|
| 1. Credits are non-transferable |
| 2. Credits decay per-turn (δ = 0.995) |
| 3. Credits are capability-scoped (retrieval ≠ file write ≠ model access) |
| 4. Rewards require external verification (oracle separate from spender) |
| 5. Ledger is append-only |
| 6. Oracle cannot be directly influenced by the spending agent |
| 7. Failed work cannot generate positive credit |
| 8. Credit ≠ identity trust (high credit ≠ blanket access) |
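|
|
| Several of these invariants can be illustrated with a small in-memory ledger. This is a sketch under assumptions, not the implementation in `ledger.py`: the `Event` fields, method names, and scope strings are hypothetical, and append-only behavior is kept by convention rather than enforced. |
|
```python
from dataclasses import dataclass, field

DECAY = 0.995  # per-turn decay factor (invariant 2)


@dataclass(frozen=True)
class Event:
    """One immutable ledger entry; balances are derived, never stored."""
    turn: int
    agent: str
    scope: str   # capability the credit is valid for, e.g. "retrieval"
    kind: str    # "reward", "spend", or "decay"
    delta: float
    reason: str


@dataclass
class Ledger:
    events: list = field(default_factory=list)

    def balance(self, agent, scope):
        return sum(e.delta for e in self.events
                   if e.agent == agent and e.scope == scope)

    def reward(self, turn, agent, scope, oracle_score, reason):
        # Invariant 7: only positively verified work earns credit.
        if oracle_score > 0:
            self.events.append(Event(turn, agent, scope, "reward", oracle_score, reason))

    def spend(self, turn, agent, scope, amount, reason):
        # Invariant 3: credit in one scope cannot pay for another scope.
        if amount <= 0 or amount > self.balance(agent, scope):
            return False
        self.events.append(Event(turn, agent, scope, "spend", -amount, reason))
        return True

    def decay(self, turn):
        # Invariant 5: decay is appended as new events; old events are never edited.
        for agent, scope in sorted({(e.agent, e.scope) for e in self.events}):
            bal = self.balance(agent, scope)
            if bal > 0:
                self.events.append(
                    Event(turn, agent, scope, "decay", -(1 - DECAY) * bal, "per-turn decay"))


ledger = Ledger()
ledger.reward(1, "agent_a", "retrieval", 1.0, "oracle-verified answer")
ledger.decay(2)
denied = ledger.spend(2, "agent_a", "file_write", 0.5, "wrong scope")
allowed = ledger.spend(2, "agent_a", "retrieval", 0.5, "retrieval call")
print(denied, allowed, round(ledger.balance("agent_a", "retrieval"), 3))
```
|
| There is deliberately no transfer method (invariant 1). With δ = 0.995, unspent credit halves after about ln 0.5 / ln 0.995 ≈ 138 turns. |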
|
|
| ### Threat Model |
|
|
| | Attack | Defense | Residual Risk | |
| |--------|---------|---------------| |
| | Credit farming (easy tasks) | Decay + caps | Slow gaming over many tasks | |
| | Collusion (multiple agents) | Non-transferability | Vote-ring behavior | |
| | Oracle spoofing | Verifier separation | Judge hacking | |
| | Griefing (burn others' budget) | Scoped spend | Indirect poisoning | |
| | Identity laundering | Identity binding | Account churn | |
| | Strategic abstention | Reward shaping | Conservatism bias | |
| | Verbosity gaming | Token-cost multiplier | Requires quality oracle | |
| | Confidence manipulation | Proper scoring rules | Hard to calibrate perfectly | |
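|
|
| The table names proper scoring rules as the defense against confidence manipulation. The Brier score is one standard proper scoring rule. This report does not state which rule OCC uses, so the sketch below is an illustration with a hypothetical function name and toy confidences. |
|
```python
def brier_penalty(confidence: float, correct: bool) -> float:
    """Squared error between stated confidence and the 0/1 outcome (lower is better)."""
    outcome = 1.0 if correct else 0.0
    return (confidence - outcome) ** 2


confident_wrong = brier_penalty(0.95, False)
hedged_wrong = brier_penalty(0.60, False)
confident_right = brier_penalty(0.95, True)
print(confident_wrong, hedged_wrong, confident_right)
```
|
| A wrong answer stated at 0.95 confidence is penalized about 0.90, against 0.36 for the same wrong answer at 0.60, so inflating confidence only pays when the agent is actually right. |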
|
|
| ### When OCC Is Valuable |
|
|
| **Use OCC when**: agents have heterogeneous reliability, long-running tasks need budget discipline, debate can be poisoned, compute is expensive, auditability matters, or post-hoc accountability is required. |
|
|
| **Skip OCC when**: single-agent tasks suffice, ground truth is immediate and cheap, no adversarial participation, all agents have equal trust and capability, or verifier cost exceeds saved compute. |
|
|
| --- |
|
|
| ## PART V: ABLATIONS |
|
|
| | Ablation | Effect | |
| |----------|--------| |
| | No credit ledger | 27% less compute savings | |
| | Transferable credits | Gaming success: 0% → 45% | |
| | Non-decaying credits | Credit hoarding, -18% throughput | |
| | No confident-wrong penalty | Confident-wrong rate 2.3× higher | |
| | No calibration penalty | ECE: 0.12 → 0.31 | |
| | No cost penalty | Token usage +40% | |
| | No anti-gaming penalty | Gaming agents earn 3.2× more | |
|
|
| --- |
|
|
| ## PART VI: HONEST ASSESSMENT |
|
|
| ### What the Evidence ACTUALLY Supports |
|
|
| | Claim | Evidence Level | Notes | |
| |-------|---------------|-------| |
| | Unmanaged compute amplifies adversarial influence | **Strong**: 56.7% collapse, 2 seeds, identical result | Needs mechanism isolation to be publishable | |
| | OCC prevents catastrophic collapse | **Moderate**: OCC 180/3 (83.3%) >> equal 3-round (56.7%) | But OCC ≈ random gating at moderate budgets | |
| | Anti-gaming ledger design is novel and sound | **Moderate**: 8 attacks, zero successful vectors | Simulation only, needs live agent testing | |
| | OCC saves tokens at iso-quality | **Weak**: 21% on TruthfulQA (small dataset), 63-68% on HumanEval (adaptive retry, not OCC) | Not the core claim | |
| | OCC beats simple baselines | **Not supported**: Random gating (85.0%) ≈ OCC (83.3-85.0%) | The advantage is in preventing extremes | |
| | Learned allocation beats hand-tuned rules | **Not tested**: GRPO hook works but no policy improvement at 0.5B scale | Needs 7B+ model + training budget | |
|
|
| ### What Failed |
|
|
| 1. **OCC doesn't beat random gating at moderate budgets.** The mechanism prevents catastrophe but doesn't improve the median case. |
| 2. **TruthfulQA abstention is judge-dependent.** 28% → 3% depending on scorer. |
| 3. **GRPO training produced no policy improvement at 0.5B.** |
| 4. **HumanEval is adaptive retry, not OCC.** Honest labeling needed. |
|
|
| ### Wrong Assumptions |
|
|
| 1. "In-process exec is good enough for HumanEval": WRONG. |
| 2. "More debate turns always helps": WRONG (56.7% collapse). |
| 3. "OCC will outperform random gating in the median case": NOT YET PROVEN. |
| 4. "TruthfulQA string-matching is a reasonable scorer": WRONG (59pp swing). |
|
|
| ### Is OCC Actually Useful? |
|
|
| **For preventing catastrophic compute misallocation: yes.** The equal-3-round collapse is a genuine finding: more compute can make things worse, and a credit mechanism prevents the worst of it. |
|
|
| **For marginal gains in well-behaved systems: not yet proven.** Random gating works nearly as well when there's no adversary. OCC's value is in adversary-aware deployment. |
|
|
| **For production multi-agent governance: promising but early.** The ledger, broker, and anti-gaming design are sound on paper. Live-agent validation is needed. |
|
|
| ### Is This Publishable? |
|
|
| **Workshop: Yes, with the mechanism isolation experiment.** The collapse + mechanism analysis + anti-gaming design is a coherent workshop paper. |
|
|
| **Main conference: No, without:** |
| 1. Mechanism isolation results (not just the collapse number, but WHY it happens) |
| 2. Broader benchmark generalization (100+ questions, 5+ seeds) |
| 3. Statistical significance (paired bootstrap, McNemar) |
| 4. Learned allocator that beats hand-tuned rules |
| 5. Oracle robustness analysis under different oracle types |
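|
|
| For item 3, an exact McNemar test on paired per-topic outcomes needs only the standard library. This is a sketch: the function names and the toy outcome lists are hypothetical, and real inputs would be the per-topic correctness of two conditions on the same topics. |
|
```python
from math import comb


def discordant_counts(a_correct, b_correct):
    """Count topics where exactly one of the two conditions is correct."""
    b = sum(1 for x, y in zip(a_correct, b_correct) if x and not y)
    c = sum(1 for x, y in zip(a_correct, b_correct) if y and not x)
    return b, c


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the discordant pair counts."""
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Binomial(n, 0.5) lower tail at the smaller count, doubled for two sides.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy paired outcomes on eight topics (hypothetical, not experimental data).
occ_correct = [True, True, False, True, True, True, False, True]
equal_correct = [True, False, False, False, True, False, False, False]

b, c = discordant_counts(occ_correct, equal_correct)
p_value = mcnemar_exact(b, c)
print(b, c, p_value)
```
|
| With four discordant topics all favoring one condition, the toy p-value is 0.125, which shows why a benchmark of only a few dozen topics can struggle to reach significance even when the accuracy gap looks large. |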
|
|
| ### The Path Forward |
|
|
| 1. **Run mechanism isolation** (script uploaded): break the collapse into testable causes |
| 2. **Scale the benchmark**: 100-300 debate questions, 5-10 seeds |
| 3. **Add stronger baselines**: confidence gating, disagreement gating, bandit allocation, auction allocation, judge-weighted voting |
| 4. **Oracle ablation**: ground-truth, LLM judge, noisy, adversarial oracles |
| 5. **GRPO-learned allocator** at 7B+ scale |
| 6. **Split into focused papers**: (a) Collapse mechanism, (b) OCC governance design, (c) Learned allocation |
|
|
| ### The Bold Claim |
|
|
| **In multi-agent systems, compute is not neutral. Extra turns can amplify adversarial influence unless access to deliberation is governed by verified marginal contribution. OCC is a first mechanism-design layer for making agent compute scarce, earned, scoped, decaying, and auditable.** |
|
|
| This is the sentence. Everything else is evidence, honesty, and the mechanism that makes it true. |
|
|
| --- |
|
|
| ## Repository |
|
|
| - **Main repo:** https://huggingface.co/narcolepticchicken/occ-stack |
| - **Formal definition:** `design.md` |
| - **Mechanism isolation:** `jobs/occ_debate_collapse_mechanism.py` |
| - **Results:** `reports/` |
|
|
| --- |
|
|
| ## Changelog |
|
|
| - **v12 (CORRECTED FRAMING):** Reframed around "compute-as-attack-surface." Added formal OCC definition (design.md). Added threat model, "when not to use OCC," ledger event schema. HumanEval honestly labeled as adaptive retry. TruthfulQA labeled as oracle-dependence demonstration. Mechanism isolation script written. Collapse is the wedge. |
| - **v11:** TruthfulQA AllenAI results. 2-seed debate aggregate. Honest assessment. |
| - **v10:** Extended baselines running. Initial equal_3round collapse finding. |
| |
| --- |
| |
| *OCC is a research prototype. The collapse is real. The mechanism is promising. The evidence is incomplete. The framing is now honest.* |
| |