pyrrho-MoE-g3-alpha

pyrrho-MoE-g3-alpha is a CPU-runnable research alpha for RAG evidence governance. It reads a user query plus retrieved contexts and predicts one of ABSTAIN, DISPUTED, or TRUSTWORTHY.

This alpha is intentionally not the final custom pyrrho-MoE 4B-A0.4B sparse model. It packages three seeds of the working Stage 0.7 support-aggregation prototype, each with its own post-hoc verifier reranker. The default release policy is trustworthy_quorum_2_of_3, which predicts TRUSTWORTHY only when at least two of the three guarded seed predictions are TRUSTWORTHY.
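The quorum rule can be sketched in a few lines of Python. This is a minimal illustration, not the packaged implementation: the function name, its arguments, and in particular the fallback used when the quorum is not met (the higher mean probability between ABSTAIN and DISPUTED) are assumptions made for this example.

```python
from collections import Counter
from typing import Mapping, Sequence


def trustworthy_quorum(
    guarded_labels: Sequence[str],
    mean_probs: Mapping[str, float],
    quorum: int = 2,
) -> str:
    """Return TRUSTWORTHY only if at least `quorum` seeds predict it."""
    votes = Counter(guarded_labels)
    if votes["TRUSTWORTHY"] >= quorum:
        return "TRUSTWORTHY"
    # Hypothetical fallback (an assumption, not the packaged behavior):
    # pick whichever of ABSTAIN / DISPUTED has the higher mean probability.
    return max(("ABSTAIN", "DISPUTED"), key=lambda label: mean_probs.get(label, 0.0))
```

With this rule, a single TRUSTWORTHY vote out of three is never enough to emit TRUSTWORTHY, which is the mechanism that trades some recall for a lower false-TRUSTWORTHY rate.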

What Is Packaged

| Path | Contents |
| --- | --- |
| manifest.json | Release manifest, seed paths, hashes, thresholds, feature schema, and default policy metrics. |
| seeds/seed_{42,1337,7}/model.pt | Stage 0.7 support-aggregation checkpoints. |
| seeds/seed_{42,1337,7}/verifier.joblib | Per-seed post-hoc HGB verifier. |
| seeds/seed_{42,1337,7}/verifier_report.json | Per-seed verifier metrics and thresholds. |
| config/pyrrho_moe_stage0_7_support_aggregation.yaml | Training/runtime config for the packaged checkpoints. |
| metadata/metadata.json | Route, taxonomy, scalar-field, and label metadata needed for raw local inference. |
| reports/posthoc_policy_compare_* | Full eval/test policy comparison for the packaged verifier outputs. |

The checkpoints are local, PyTorch-native files. This is not a transformers.AutoModel release.

Labels

| Label | Meaning |
| --- | --- |
| ABSTAIN | Retrieved sources do not contain enough evidence to answer. |
| DISPUTED | Retrieved sources conflict on the answer. |
| TRUSTWORTHY | Retrieved sources consistently support answering. |

Results

Metrics are on the local fitz-gov V8 MoE split (train=19,674, eval=2,459, test=2,459). The eval (validation) split was used to select checkpoint and verifier thresholds; the held-out test split is used only for final reporting.

| Operating point | Split | Accuracy | False-TRUSTWORTHY | TRUSTWORTHY recall |
| --- | --- | --- | --- | --- |
| Per-seed guarded mean | eval | 89.33 +/- 0.06% | 2.76 +/- 0.00% | 82.54 +/- 0.75% |
| Per-seed guarded mean | test | 89.29 +/- 0.69% | 2.37 +/- 0.26% | 80.20 +/- 1.64% |
| trustworthy_quorum_2_of_3 | eval | 90.81% | 1.88% | 83.77% |
| trustworthy_quorum_2_of_3 | test | 90.65% | 1.90% | 81.71% |

The headline figures for the default policy are the held-out test results for trustworthy_quorum_2_of_3: 90.65% accuracy / 1.90% false-TRUSTWORTHY / 81.71% TRUSTWORTHY recall.
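For readers who want to compute comparable numbers on their own predictions, the three reported metrics can be sketched as follows. The exact definitions used by the packaged evaluation are not stated here, so this sketch assumes that the false-TRUSTWORTHY rate is the share of gold non-TRUSTWORTHY rows predicted as TRUSTWORTHY; the function name and return keys are illustrative.

```python
from typing import Dict, Sequence


def governance_metrics(gold: Sequence[str], pred: Sequence[str]) -> Dict[str, float]:
    """Accuracy, false-TRUSTWORTHY rate, and TRUSTWORTHY recall as fractions."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and the same length")
    pairs = list(zip(gold, pred))
    correct = sum(g == p for g, p in pairs)
    gold_tw = sum(g == "TRUSTWORTHY" for g, _ in pairs)
    gold_not_tw = len(pairs) - gold_tw
    true_tw = sum(g == "TRUSTWORTHY" and p == "TRUSTWORTHY" for g, p in pairs)
    # Assumed definition: denominator is the number of gold non-TRUSTWORTHY rows.
    false_tw = sum(g != "TRUSTWORTHY" and p == "TRUSTWORTHY" for g, p in pairs)
    return {
        "accuracy": correct / len(pairs),
        "false_trustworthy_rate": false_tw / gold_not_tw if gold_not_tw else 0.0,
        "trustworthy_recall": true_tw / gold_tw if gold_tw else 0.0,
    }
```

If the packaged reports instead normalize false-TRUSTWORTHY by the total number of rows, only the denominator in the marked line would change.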

Quick Start

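This package is meant to run from a checkout of the pyrrho repository with its Python environment installed. The commands below use PowerShell syntax (backtick line continuations and backslash paths); in a POSIX shell, end continued lines with a backslash instead and use forward slashes in paths.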

Input JSONL accepts raw RAG rows:

{"id": "case-1", "query": "Has the company achieved profitability?", "contexts": ["The company posted net income of $4 million.", "A later filing reported a quarterly loss of $12 million."]}
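An input file in this shape can be produced with a short helper. The sketch below is illustrative only: the helper names are made up for this example, and the validation rules (string id, non-empty query, list of string contexts) are inferred from the sample row above rather than taken from the package.

```python
import json
from pathlib import Path
from typing import Iterable


def validate_row(row: dict) -> dict:
    """Check a row against the shape of the sample above (assumed rules)."""
    if not isinstance(row.get("id"), str):
        raise ValueError("row needs a string 'id'")
    if not isinstance(row.get("query"), str) or not row["query"].strip():
        raise ValueError("row needs a non-empty string 'query'")
    contexts = row.get("contexts")
    if not isinstance(contexts, list) or not all(isinstance(c, str) for c in contexts):
        raise ValueError("row needs 'contexts' as a list of strings")
    return row


def write_jsonl(rows: Iterable[dict], path: str) -> int:
    """Write one JSON object per line and return the number of rows written."""
    count = 0
    with Path(path).open("w", encoding="utf-8") as handle:
        for row in rows:
            handle.write(json.dumps(validate_row(row), ensure_ascii=False) + "\n")
            count += 1
    return count
```

Calling write_jsonl(rows, "examples.jsonl") would then yield a file that can be passed to --input.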

Run one seed:

python scripts\infer_moe_posthoc.py `
  --package-dir models\pyrrho-MoE-g3-alpha `
  --seed 42 `
  --input examples.jsonl `
  --output predictions_seed42.jsonl

Run the default alpha policy:

python scripts\infer_moe_posthoc.py `
  --package-dir models\pyrrho-MoE-g3-alpha `
  --policy trustworthy_quorum_2_of_3 `
  --input examples.jsonl `
  --output predictions_quorum.jsonl

Each seed output includes the base label, guarded label, governance probabilities, selected route, taxonomy pattern, scalar predictions, verifier accept score, and verifier rejection flag. The quorum output includes the final policy label, per-seed labels, per-seed rejection flags, and mean governance probabilities.
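A quick way to inspect an output file is to tally the final labels. The output field names are not listed here, so the default key in this sketch, policy_label, is a placeholder assumption; check a line of your own output and pass the actual key.

```python
import json
from collections import Counter
from pathlib import Path


def count_labels(path: str, label_key: str = "policy_label") -> Counter:
    """Count label values in a JSONL file.

    `label_key` defaults to a hypothetical field name; a KeyError here
    means the real output uses a different key.
    """
    counts: Counter = Counter()
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            counts[json.loads(line)[label_key]] += 1
    return counts
```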

Reload Checks

Hash-verified package load:

python scripts\infer_moe_posthoc.py `
  --package-dir models\pyrrho-MoE-g3-alpha `
  --policy trustworthy_quorum_2_of_3 `
  --device cpu `
  --input data\moe_v8\test.jsonl `
  --max-samples 8 `
  --output models\pyrrho-MoE-g3-alpha\inference_quorum2_smoke.jsonl
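Hash verification of this kind can also be done by hand. The sketch below shows the general approach with SHA-256; the hash algorithm and the layout of the hashes inside manifest.json are not specified here, so the function expects a plain mapping of relative path to hex digest that you would extract yourself, and both function names are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import List, Mapping


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large checkpoints are not read into memory at once."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(package_dir: str, expected: Mapping[str, str]) -> List[str]:
    """Return relative paths whose SHA-256 differs from the expected digest.

    `expected` maps relative path -> hex digest (an assumed layout,
    not the documented manifest schema).
    """
    root = Path(package_dir)
    return [rel for rel, want in expected.items() if sha256_file(root / rel) != want.lower()]
```

An empty list from find_mismatches means every listed file matched its expected digest.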

Full metric reproduction can be run locally with the prepared data directory:

python scripts\package_moe_posthoc_verifier.py evaluate `
  --package-dir models\pyrrho-MoE-g3-alpha `
  --data-dir data\moe_v8 `
  --split both `
  --device cpu `
  --output models\pyrrho-MoE-g3-alpha\package_eval_report.json

Intended Use

Use this artifact for research, local RAG-governance experiments, policy comparisons, and CPU runtime prototyping. It is useful when the priority is a decision signal over retrieved contexts with a low false-TRUSTWORTHY rate, and when running three forward passes (one per seed) locally is acceptable.

Do not treat this as the final production pyrrho-MoE architecture. The terminal path remains a custom CPU-runnable 4B total / 0.4B active sparse model with pyrrho-defined experts and supervised routing. Current Qwen/upcycled-style work is not included in this alpha and is not release quality.

Limitations

  • Three Stage 0.7 seed forwards are required for the default policy.
  • The prototype uses PyTorch checkpoints with hash-based tokenization, not a general pretrained tokenizer/Transformers artifact.
  • The verifier is a packaged post-hoc reranker over frozen Stage 0.7 outputs.
  • English-only evaluation.
  • The model judges only the supplied contexts; it does not retrieve evidence or verify facts against outside knowledge.
  • The artifact is a research alpha and should be evaluated before any high-stakes deployment.
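To make the hash-tokenization limitation concrete, here is a generic sketch of the hashing trick: each token is mapped to one of a fixed number of buckets by a stable hash, so no learned vocabulary is needed, but unrelated tokens can collide. The splitting rule, hash function, and bucket count below are assumptions for illustration and are not taken from the packaged checkpoints.

```python
import hashlib
import re
from typing import List


def hash_tokenize(text: str, num_buckets: int = 32768) -> List[int]:
    """Map each token to a bucket id in [0, num_buckets) with a stable hash."""
    # Assumed splitting rule: lowercase words and single punctuation marks.
    tokens = re.findall(r"\w+|[^\w\s]", text.lower())
    ids = []
    for token in tokens:
        # blake2b is used instead of Python's built-in hash(), which is
        # salted per process and therefore not reproducible across runs.
        digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
        ids.append(int.from_bytes(digest, "big") % num_buckets)
    return ids
```

Because the ids carry no pretrained semantics, a scheme like this cannot be swapped for a standard pretrained tokenizer without retraining.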

Citation

@misc{pyrrho_moe_g3_alpha_2026,
  title  = {pyrrho-MoE-g3-alpha},
  author = {Yan Fitzner},
  year   = {2026},
  url    = {https://huggingface.co/yafitzdev/pyrrho-MoE-g3-alpha},
}

License

CC BY-NC 4.0. Free for research, evaluation, and personal use; commercial use requires a separate license.
