---
license: cc-by-nc-4.0
language: en
tags:
- decision-making
- auditable-ai
- bounded-decisions
- multi-task
- transformers
- explainability
- confidence-scoring
- human-values
- sentiment-analysis
metrics:
- f1
- accuracy
pipeline_tag: text-classification
inference: true
gated: true
extra_gated_prompt: "Access is provided for research and evaluation use only. Redistribution, commercial use, or publication of model weights is not permitted without written approval from Simple Machine Mind."
extra_gated_fields:
  Organization: text
  Intended use:
    type: select
    options:
      - Research
      - Evaluation
      - Commercial evaluation
      - Other
  I agree to the access terms: checkbox
---
# Evaluator v2: Auditable AI Decision System (EvaluatorDPT)
**Model ID:** `pcsankar73s/EvaluatorModel`
**License:** CC BY-NC 4.0 (non-commercial; approval required for inference)
**Access:** 🔒 Gated: visible to all, usable only with explicit approval
**Author:** Sankaranarayanan Palamadai Chandrasekaran · [Simple Machine Mind](https://www.smsquared.ai)
---
## Overview
Most AI systems are built to always give an answer, even when they shouldn't. EvaluatorDPT is built differently: it reads structured signals, does not generate text, and produces a bounded decision of **YES**, **NO**, or **defer to a human** (TBD). Because it is signal-based and deterministic and emits no free text, it does not hallucinate. When it flags a case as uncertain, it is right to do so **93% of the time** on the held-out test split (TBD precision: 0.9306). The deferral threshold is tunable at deployment, so teams can steer decisions toward their risk tolerance or business objective without retraining the underlying model.
EvaluatorDPT is a BERT-based multi-task model for **auditable decision control under ambiguity**. It produces a bounded three-class decision (YES / NO / TBD) alongside structured auxiliary outputs that remain available at inference time as explainability signals and control variables.
Unlike conventional classifiers that force a binary output regardless of evidence quality, EvaluatorDPT treats **TBD (defer)** as a trained first-class outcome, so uncertain cases can be routed to conservative handling without retraining the core model.
The model predicts:
- **Decision**: YES / NO / TBD (defer)
- **Auxiliary Head 1**: detects sentiment turbulence, i.e. emotional noise affecting decision clarity (28 labels)
- **Auxiliary Head 2**: captures semantic value signals, i.e. ethical anchors such as fairness or caution (10 labels)
Auxiliary outputs are **retained at inference time** as structured control variables for downstream steering, thresholding, and reason-code generation.
Input/output contract: a context signal is mapped to a bounded decision, decision confidence, structured reason codes, and reason-code confidence scores.
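
The tunable deferral described above can be sketched as a small post-processing step on the decision probabilities. This is a minimal illustration, not the author's implementation: the `GatedDecision` type, the `gate_decision` function, and the threshold values are all hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass

LABELS = ("YES", "NO", "TBD")


@dataclass(frozen=True)
class GatedDecision:
    label: str         # final bounded decision after gating
    confidence: float  # probability of the model's top class
    deferred: bool     # True when a YES/NO was downgraded to TBD


def gate_decision(probs, yes_threshold=0.80, no_threshold=0.80):
    """Apply deployment-specific thresholds to decision probabilities.

    `probs` maps each label to a probability. The threshold defaults are
    illustrative assumptions, not values from the model card.
    """
    top = max(LABELS, key=lambda name: probs[name])
    confidence = probs[top]
    threshold = {"YES": yes_threshold, "NO": no_threshold}.get(top)
    # A YES or NO below its threshold is routed to a human instead.
    if threshold is not None and confidence < threshold:
        return GatedDecision("TBD", confidence, deferred=True)
    return GatedDecision(top, confidence, deferred=False)


confident = gate_decision({"YES": 0.91, "NO": 0.04, "TBD": 0.05})
borderline = gate_decision({"YES": 0.55, "NO": 0.15, "TBD": 0.30})
print(confident.label, borderline.label)
```

Raising `yes_threshold` makes the system more conservative about committing to YES; the model weights are untouched, which is the sense in which steering happens without retraining.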
---
## Architecture
**Backbone:** `bert-base-uncased` (12-layer Transformer)
**Heads:**
- `decision`: primary 3-class classifier (YES / NO / TBD) with confidence score
- `auxiliary_head_1`: multi-label signal layer for sentiment turbulence (28 labels)
- `auxiliary_head_2`: multi-label signal layer for value alignment (10 labels)
All inputs are tokenized to a maximum sequence length of 128 tokens.
**Training recipe:** Gradual unfreeze → full unfreeze · LR = 1e-5 · Batch size = 32 · Early stopping (patience = 2) · Threshold sweep · Layer-wise differential learning rates · Cosine decay with warmup ratio 0.1 · Class weights on decision head for imbalance handling
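
To make the schedule in the recipe concrete, the sketch below computes a learning rate for a given step using the stated base LR (1e-5) and warmup ratio (0.1). The exact shape is an assumption: it uses a linear warmup from zero followed by cosine decay to zero, and the function name `lr_at_step` and the step count are hypothetical.

```python
import math


def lr_at_step(step, total_steps, base_lr=1e-5, warmup_ratio=0.1):
    """Learning rate at `step` under linear warmup then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


total = 1000  # hypothetical number of optimizer steps
for step in (0, 50, 100, 550, 1000):
    print(step, f"{lr_at_step(step, total):.2e}")
```

With layer-wise differential learning rates, each parameter group would presumably apply its own `base_lr` to the same curve; the per-layer values are not given in this card.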
---
## Performance
Trained on **181,000** curated decision events. Evaluated on a stratified held-out test split of **22,748 examples** (TBD majority class at 60.3%).
| Method | Accuracy | Macro F1 | Micro F1 | Weighted F1 |
|---|---|---|---|---|
| Majority baseline (always TBD) | 0.6030 | 0.2508 | 0.6030 | 0.4537 |
| **EvaluatorDPT** | **0.8485** | **0.8215** | **0.8485** | **0.8506** |
**Per-class breakdown:**
| Class | Precision | Recall | F1 | Support |
|---|---|---|---|---|
| YES | 0.7683 | 0.9029 | 0.8302 | 5,871 |
| NO | 0.7164 | 0.7923 | 0.7524 | 3,159 |
| TBD | 0.9306 | 0.8381 | 0.8819 | 13,718 |
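
As a consistency check, the aggregate scores can be recomputed from the per-class table. The sketch below uses only the precision, recall, and support values printed above; because those inputs are already rounded to four decimals, agreement is expected only to within rounding.

```python
# Per-class precision, recall, and support copied from the table above.
per_class = {
    "YES": (0.7683, 0.9029, 5871),
    "NO": (0.7164, 0.7923, 3159),
    "TBD": (0.9306, 0.8381, 13718),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


total = sum(support for _, _, support in per_class.values())
f1_scores = {name: f1(p, r) for name, (p, r, _) in per_class.items()}

macro_f1 = sum(f1_scores.values()) / len(f1_scores)
weighted_f1 = sum(f1_scores[n] * s for n, (_, _, s) in per_class.items()) / total
# Accuracy equals the support-weighted mean of per-class recall.
accuracy = sum(r * s for _, r, s in per_class.values()) / total

print(f"macro F1 {macro_f1:.4f}, weighted F1 {weighted_f1:.4f}, accuracy {accuracy:.4f}")
```

The supports sum to the 22,748 test examples, and the TBD share (13,718 of 22,748) matches the 60.3% majority-class figure.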
**Inference latency** (NVIDIA Tesla T4 GPU, 200 runs): p50 = 200 ms · p95 = 415 ms
---
## Data Processing Modules
| Included for Further Progress | Cited (for Reference / Citation) |
|---|---|
| process_semeval2017_local | process_sentiment140 |
| process_financial_phrasebank | process_imdb |
| process_tweeteval | process_multinli |
| process_goemotions | process_tweeteval_health |
| process_normbank_csv_concatenated | |
| process_mft_from_json | |
| process_meld | |
| process_empathetic_dialogues | |
| process_social_bias_frames | |
| process_ethics_local | |
| process_ethics_virtue | |
---
## Use Cases
**Decision gating under ambiguity**: route inputs to YES, NO, or deferred handling based on evidence quality without forcing a binary commit.
**Auditable AI workflows**: every decision ships with a confidence score, value alignment signal, and sentiment turbulence signal that downstream systems can log, inspect, and act on.
**Risk-sensitive deployments**: use TBD precision (0.9306) and confidence scores to calibrate the YES execution threshold for deployment-specific risk tolerance without retraining.
**Reason-code generation**: auxiliary outputs provide structured context for human-readable explanations alongside each decision.
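
One way the auditable-workflow and reason-code use cases might fit together is sketched below: auxiliary multi-label scores are thresholded into reason codes and bundled with the decision into a loggable record. Everything named here is an assumption for illustration, including the label names, the 0.5 cutoff, the record fields, and the helper functions; the real heads have 28 and 10 labels whose names are not listed in this card.

```python
import json

# Hypothetical label names; the real heads have 28 and 10 labels respectively.
TURBULENCE_LABELS = ["anger", "fear", "confusion"]
VALUE_LABELS = ["fairness", "caution"]


def reason_codes(scores, labels, threshold=0.5):
    """Keep labels whose score clears the threshold, highest score first."""
    kept = [(name, s) for name, s in zip(labels, scores) if s >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [{"code": name, "confidence": round(s, 4)} for name, s in kept]


def audit_record(decision, confidence, turbulence_scores, value_scores):
    """Bundle one decision and its supporting signals into a loggable dict."""
    return {
        "decision": decision,
        "decision_confidence": round(confidence, 4),
        "sentiment_turbulence": reason_codes(turbulence_scores, TURBULENCE_LABELS),
        "value_signals": reason_codes(value_scores, VALUE_LABELS),
    }


record = audit_record("TBD", 0.62, [0.10, 0.71, 0.55], [0.30, 0.88])
print(json.dumps(record, indent=2))
```

Keeping the record a plain dictionary means it can be written to any structured log and inspected later without the model being present.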
---
## Example Usage
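```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# The repository is gated: request access and authenticate first
# (for example with `huggingface-cli login`).
tokenizer = AutoTokenizer.from_pretrained("pcsankar73s/EvaluatorModel")
model = AutoModelForSequenceClassification.from_pretrained("pcsankar73s/EvaluatorModel")
model.eval()

inputs = tokenizer(
    "Should we proceed given the current context?",
    return_tensors="pt",
    max_length=128,
    truncation=True,
)

with torch.no_grad():
    outputs = model(**inputs)

# outputs.logits -> decision logits (YES / NO / TBD)
# Confidence score: softmax over the decision logits.
probs = torch.softmax(outputs.logits, dim=-1)[0]
confidence, index = probs.max(dim=-1)
# Label order is read from the model config rather than assumed.
print(model.config.id2label[index.item()], round(confidence.item(), 4))
```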
---
## Limitations
- Results are specific to the training distribution; generalization to other domains requires separate validation.
- The NO class is under-represented (13.9% of the test split) and has the lowest per-class F1 (0.7524); targeted sampling may improve this.
- Inputs exceeding 128 tokens are truncated; longer documents require chunking or preprocessing.
- Reported latency is hardware-dependent; re-characterize for your inference environment.
- Auxiliary heads provide structured signals, not ground-truth classifiers for values or emotions.
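
For the 128-token limit noted above, one common workaround is to split long inputs into overlapping windows and score each window separately. The sketch below shows only the windowing step on a list of token ids; the function name `chunk_token_ids`, the stride of 32, and the reservation of two positions for BERT's `[CLS]` and `[SEP]` tokens are assumptions for illustration, and how per-window decisions should be combined is not specified by this card.

```python
def chunk_token_ids(token_ids, max_length=128, stride=32, num_special=2):
    """Split token ids into overlapping windows that fit the model's limit.

    `num_special` reserves room for [CLS] and [SEP]; `stride` is the number
    of tokens shared by consecutive windows (an illustrative choice).
    """
    window = max_length - num_special
    if not 0 <= stride < window:
        raise ValueError("stride must be non-negative and smaller than the window")
    step = window - stride
    chunks = []
    for start in range(0, max(len(token_ids), 1), step):
        chunks.append(token_ids[start:start + window])
        if start + window >= len(token_ids):
            break  # the last window already reaches the end of the input
    return chunks


ids = list(range(300))  # stand-in for tokenizer output without special tokens
chunks = chunk_token_ids(ids)
print([len(c) for c in chunks])
```

A conservative aggregation policy, again as an assumption, would be to defer (TBD) whenever any window defers or the windows disagree.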
---
## Links
- GitHub: [pcsankar73/EvaluatorDPT-Publish](https://github.com/pcsankar73/EvaluatorDPT-Publish)
- OSF preprint: [https://osf.io/ztnya/](https://osf.io/ztnya/)
- Paper (arXiv): TBD
- Contact: sankar@smsquared.ai
---
## License
Model artifacts: [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/) (non-commercial use; contact for commercial licensing).
Code and documentation: see repository [LICENSE](https://github.com/pcsankar73/EvaluatorDPT-Publish/blob/main/LICENSE).
---