pyrrho-nano-g3.1
pyrrho-nano-g3.1 is a small multitask RAG governance co-processor for anti-hallucination and retrieval-quality pipelines. It reads a user question plus retrieved source passages, then returns a calibrated evidence-state decision and auxiliary signals that fitz-sage can use before answer generation.
It is not an answer generator and not an open-world fact checker. It sits between
retrieval and generation, or beside a retrieval package as a fast evidence
quality layer. Compared with pyrrho-nano-g3, this package adds multitask heads
for pre-retrieval query-contract classification, semantic route/domain, taxonomy
pattern, and six scalar governance signals.
Governance Labels
| Label | Meaning |
|---|---|
| ABSTAIN | The retrieved sources do not contain enough evidence to answer the question. |
| DISPUTED | The retrieved sources conflict on the answer. |
| TRUSTWORTHY | The retrieved sources consistently support answering the question. |
Multitask Heads
| Head | Labels / values | Intended use |
|---|---|---|
| governance | ABSTAIN, DISPUTED, TRUSTWORTHY | Post-retrieval evidence sufficiency and conflict decision. |
| query_contract | evidence_sufficiency, structured_lookup, temporal_grounding, exhaustive_coverage, comparison_coverage, representative_overview | Pre-retrieval routing signal for what kind of evidence the query needs. |
| route | science_medicine, law_policy, history_geography, technology_computing, economics_finance, culture_society, general_commonsense | Semantic route/domain signal for retrieval policy and logging. |
| taxonomy | 23 fitz-gov taxonomy patterns | Failure/support pattern signal for audit and diagnostics. |
| scalars | evidence_sufficiency, query_evidence_alignment, answer_coverage, conflict_density, retrieval_retry_value, false_trustworthy_risk | Continuous governance signals for retry, ranking, and monitoring. |
Outputs
This is a custom multitask package, not a standard single-head
AutoModelForSequenceClassification artifact. The recommended runtime is
pyrrho.multitask_inference.PyrrhoMultiTaskPredictor from the pyrrho repository.
The predictor returns a structured object:
| Field | Meaning |
|---|---|
| governance.final_label | Final calibrated label after the TRUSTWORTHY threshold rule. |
| governance.raw_label | Highest-probability governance label before threshold calibration. |
| governance.probabilities | Probability distribution over ABSTAIN, DISPUTED, TRUSTWORTHY. |
| governance.threshold | TRUSTWORTHY probability threshold used by the package. |
| query_contract.final_label | Query-only contract prediction. |
| route.final_label | Query-only semantic route/domain prediction. |
| taxonomy.final_label | Query+evidence taxonomy-pattern prediction. |
| scalars | Six bounded scalar governance signals. |
| timing_ms | Local inference timing for the call. |
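The relationship between `raw_label`, `final_label`, and `used_threshold_fallback` can be illustrated with a minimal sketch. The exact rule the package applies is not documented here, so the logic below is an assumption: TRUSTWORTHY is kept only when its probability reaches the threshold, and otherwise the more probable of ABSTAIN and DISPUTED is used. The function name `apply_trustworthy_threshold` is hypothetical and not part of the pyrrho runtime.

```python
from typing import Dict, Tuple

GOVERNANCE_LABELS = ("ABSTAIN", "DISPUTED", "TRUSTWORTHY")


def apply_trustworthy_threshold(
    probabilities: Dict[str, float], threshold: float = 0.39
) -> Tuple[str, str, bool]:
    """Return (raw_label, final_label, used_threshold_fallback).

    Assumed rule, not confirmed by the package: a TRUSTWORTHY argmax is
    accepted only if its probability is at least `threshold`; otherwise
    the label falls back to the more probable of ABSTAIN and DISPUTED.
    """
    # Raw label: plain argmax over the three governance classes.
    raw_label = max(GOVERNANCE_LABELS, key=lambda label: probabilities[label])
    if raw_label != "TRUSTWORTHY" or probabilities["TRUSTWORTHY"] >= threshold:
        return raw_label, raw_label, False
    # TRUSTWORTHY won the argmax but is below the threshold: demote it.
    fallback = max(("ABSTAIN", "DISPUTED"), key=lambda label: probabilities[label])
    return raw_label, fallback, True


confident = {"ABSTAIN": 0.08, "DISPUTED": 0.08, "TRUSTWORTHY": 0.84}
borderline = {"ABSTAIN": 0.33, "DISPUTED": 0.30, "TRUSTWORTHY": 0.37}
print(apply_trustworthy_threshold(confident))
print(apply_trustworthy_threshold(borderline))
```

Under this assumed rule, a weak TRUSTWORTHY argmax is demoted rather than passed to generation, which matches the model's stated goal of keeping the false-TRUSTWORTHY rate low.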
Example normalized output shape:
```json
{
  "schema_version": "pyrrho_multitask_prediction_v1",
  "governance": {
    "raw_label": "TRUSTWORTHY",
    "final_label": "TRUSTWORTHY",
    "used_threshold_fallback": false,
    "threshold": 0.39,
    "confidence": 0.84,
    "probabilities": {
      "ABSTAIN": 0.08,
      "DISPUTED": 0.08,
      "TRUSTWORTHY": 0.84
    }
  },
  "query_contract": {
    "final_label": "structured_lookup"
  },
  "route": {
    "final_label": "economics_finance"
  },
  "taxonomy": {
    "final_label": "direct_answer"
  },
  "scalars": {
    "evidence_sufficiency": 0.91,
    "query_evidence_alignment": 0.88,
    "answer_coverage": 0.86,
    "conflict_density": 0.08,
    "retrieval_retry_value": 0.12,
    "false_trustworthy_risk": 0.09
  }
}
```
The model does not generate answers, citations, source spans, retrieval results,
or natural-language explanations. It classifies and scores the (query, retrieved_contexts) evidence state.
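Because downstream code depends on this output shape, a lightweight sanity check before acting on a prediction may be useful. The sketch below is one possible check and is not part of the pyrrho runtime: `check_prediction` is a hypothetical helper, and it assumes that "bounded" means each scalar lies in [0, 1].

```python
import math
from typing import Any, Dict, List

GOVERNANCE_LABELS = {"ABSTAIN", "DISPUTED", "TRUSTWORTHY"}
SCALAR_NAMES = {
    "evidence_sufficiency", "query_evidence_alignment", "answer_coverage",
    "conflict_density", "retrieval_retry_value", "false_trustworthy_risk",
}


def check_prediction(result: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the shape looks sane."""
    problems: List[str] = []
    gov = result.get("governance", {})
    probs = gov.get("probabilities", {})
    if set(probs) != GOVERNANCE_LABELS:
        problems.append("governance.probabilities has unexpected labels")
    elif not math.isclose(sum(probs.values()), 1.0, abs_tol=1e-2):
        problems.append("governance.probabilities do not sum to 1")
    if gov.get("final_label") not in GOVERNANCE_LABELS:
        problems.append("governance.final_label is not a governance label")
    # The three auxiliary classification heads each expose a final_label.
    for head in ("query_contract", "route", "taxonomy"):
        if "final_label" not in result.get(head, {}):
            problems.append(f"{head}.final_label is missing")
    scalars = result.get("scalars", {})
    if set(scalars) != SCALAR_NAMES:
        problems.append("scalars has unexpected keys")
    # Assumption: "bounded" means each scalar lies in [0, 1].
    for name, value in scalars.items():
        if not 0.0 <= value <= 1.0:
            problems.append(f"scalars.{name} is outside [0, 1]")
    return problems


example = {
    "governance": {
        "final_label": "TRUSTWORTHY",
        "probabilities": {"ABSTAIN": 0.08, "DISPUTED": 0.08, "TRUSTWORTHY": 0.84},
    },
    "query_contract": {"final_label": "structured_lookup"},
    "route": {"final_label": "economics_finance"},
    "taxonomy": {"final_label": "direct_answer"},
    "scalars": {
        "evidence_sufficiency": 0.91, "query_evidence_alignment": 0.88,
        "answer_coverage": 0.86, "conflict_density": 0.08,
        "retrieval_retry_value": 0.12, "false_trustworthy_risk": 0.09,
    },
}
print(check_prediction(example))
```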
Intended Use
Use this model when a RAG or retrieval package needs fast local signals about:
- whether retrieved evidence is enough to answer,
- whether retrieved evidence conflicts,
- what kind of evidence the query needs before retrieval,
- which semantic/domain route the query belongs to,
- which fitz-gov support/failure pattern is active,
- whether retrieval should retry, broaden, or escalate.
This model is not intended to write answers, verify facts outside the provided sources, replace a retriever, or replace human review in high-stakes settings.
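As an illustration of how a pipeline might act on these signals, here is a minimal gating sketch that consumes a prediction dict in the output shape described above. The function `choose_action`, the action names, and the `retry_cutoff` value of 0.5 are all hypothetical choices for this example; a real deployment would tune them against its own data.

```python
from typing import Any, Dict


def choose_action(result: Dict[str, Any], retry_cutoff: float = 0.5) -> str:
    """Map a prediction to a pipeline action (illustrative policy only)."""
    label = result["governance"]["final_label"]
    if label == "TRUSTWORTHY":
        # Evidence is consistent and sufficient: hand off to the generator.
        return "generate_answer"
    if label == "DISPUTED":
        # Sources conflict: report the disagreement instead of picking a side.
        return "surface_conflict"
    # ABSTAIN: retry retrieval only when the retry signal is high enough.
    if result["scalars"]["retrieval_retry_value"] >= retry_cutoff:
        return "retry_retrieval"
    return "decline_to_answer"


prediction = {
    "governance": {"final_label": "ABSTAIN"},
    "scalars": {"retrieval_retry_value": 0.72},
}
print(choose_action(prediction))
```

Keeping the policy in one small function makes it easy to log which branch fired, which helps when auditing abstentions later.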
Quick Start
Install the pyrrho package from the repository that contains this runtime, then load the package with the multitask predictor:
```python
from huggingface_hub import snapshot_download
from pyrrho.multitask_inference import PyrrhoMultiTaskPredictor

MODEL_ID = "yafitzdev/pyrrho-nano-g3.1"

# Download the package files from the Hub and get the local directory.
PACKAGE_DIR = snapshot_download(MODEL_ID)

query = "Which quarterly report is relevant?"
contexts = [
    "The Q2 report lists revenue, churn, and roadmap changes.",
]

predictor = PyrrhoMultiTaskPredictor.from_pretrained(PACKAGE_DIR, device="cpu")
result = predictor.predict(query, contexts)

# Each head exposes its decision under a final_label key.
print(result["governance"]["final_label"])
print(result["query_contract"]["final_label"])
print(result["route"]["final_label"])
print(result["taxonomy"]["final_label"])
print(result["scalars"])
```
For local package testing:
```bash
python scripts/package_multitask_encoder.py verify --package-dir models/pyrrho-nano-g3.1 --device cpu
```
Release Selection
- Seed: 7
- TRUSTWORTHY threshold: 0.39
- Selection reason: seed 7 had the strongest composite release score while retaining strong governance, query-contract, route, taxonomy, and scalar metrics.
Held-Out Test Metrics
| Metric | Result |
|---|---|
| Governance accuracy | 0.9805 |
| False-TRUSTWORTHY rate | 0.0095 |
| Query-contract accuracy | 0.9492 |
| Query-contract macro F1 | 0.9423 |
| Route accuracy | 0.9296 |
| Route macro F1 | 0.9282 |
| Taxonomy accuracy | 0.8943 |
| Taxonomy macro F1 | 0.8960 |
| Scalar MAE | 0.0587 |
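For readers unfamiliar with the false-TRUSTWORTHY rate, the sketch below shows one plausible way to compute it. The definition used here is an assumption, since the card does not spell it out: the share of examples whose gold label is not TRUSTWORTHY but which the model labels TRUSTWORTHY. The labels in the example are toy values for illustration, not model results.

```python
from typing import Sequence


def false_trustworthy_rate(gold: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of gold non-TRUSTWORTHY examples predicted as TRUSTWORTHY.

    Assumed definition; the release reports may normalize differently.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    # Keep predictions only for examples that should NOT be TRUSTWORTHY.
    negatives = [p for g, p in zip(gold, predicted) if g != "TRUSTWORTHY"]
    if not negatives:
        return 0.0
    return sum(p == "TRUSTWORTHY" for p in negatives) / len(negatives)


# Toy labels for illustration only.
gold = ["ABSTAIN", "DISPUTED", "TRUSTWORTHY", "ABSTAIN"]
predicted = ["ABSTAIN", "TRUSTWORTHY", "TRUSTWORTHY", "ABSTAIN"]
rate = false_trustworthy_rate(gold, predicted)
print(f"{rate:.4f}")
```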
Three-seed headline from the local release summary:
| Metric | Mean +/- std |
|---|---|
| Governance accuracy | 97.84 +/- 0.15% |
| False-TRUSTWORTHY rate | 0.85 +/- 0.07% |
| Query-contract macro F1 | 94.24 +/- 0.28% |
| Route accuracy | 93.41 +/- 0.32% |
| Taxonomy accuracy | 89.26 +/- 0.23% |
| Scalar MAE | 0.0592 +/- 0.0005 |
Training Data
Trained on fitz-gov V8.1-style rows prepared from the V8.0.1 row set plus the
mandatory routing.query_contract field. The release package records the local
training config in training_config.yaml and detailed metrics in
reports/summary.json.
Limitations
- This is a governance and routing co-processor, not a generator.
- The auxiliary heads are useful signals, not ground-truth explanations.
- Query-contract and route predictions are query-only and can be wrong when the user query is underspecified.
- Taxonomy and scalar outputs are trained on fitz-gov labels/signals and should be treated as decision-support metadata, not universal factual judgments.
- The license is CC BY-NC 4.0. Commercial use requires a separate license.
Model tree for yafitzdev/pyrrho-nano-g3.1
Base model
answerdotai/ModernBERT-base