Attuned Resonance Intake — Multi-Head RoBERTa for Call-Center Intake

Multi-head classifier/regressor built on roberta-base (six softmax heads plus a scalar sentiment head). Given a call transcript, it emits a structured intake record used downstream by the outcome predictor and the PPO router.

The GitHub repo slug is still "CEPM" during the gradual rename; the Hugging Face slugs were migrated to attuned-resonance-* on 2026-05-09 (HF preserves the old cepm-* URLs as redirects).

Research/educational use only. See disclaimer below.

Heads

| Head | Type | Output |
|------|------|--------|
| intent | Softmax (8 classes) | Caller domain (zip-source vocabulary) |
| action | Softmax (7 classes) | Call type in the predictor's call_type vocabulary — bridges the cascade NLP→predictor |
| sentiment | Regression (scalar, approx −1..1) | Overall emotional valence |
| urgency | Softmax (3 classes, 1–3) | Time-sensitivity |
| complexity | Softmax (5 classes, 1–5) | Required skill level |
| jung | Softmax (12 classes) | Caller Jung archetype (speculative head, 0.5× loss) |
| campbell | Softmax (10 classes) | Campbell hero-journey phase (speculative head, 0.5× loss) |

Backbone: RoBERTa encoder, last_hidden_state[:, 0] ([CLS]) shared across all heads. Pooler dropped (add_pooling_layer=False) since the heads consume CLS directly. Joint multi-task loss: intent_CE + action_CE + sentiment_MSE + urgency_CE + complexity_CE + 0.5·jung_CE + 0.5·campbell_CE.
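As a shape check, the head stack and joint loss described above can be sketched as follows. Class counts and loss weights come from this card; the module layout itself is an assumption (the real implementation lives in models/intake/ and may differ):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IntakeHeads(nn.Module):
    """Illustrative sketch of the head stack over the shared [CLS] vector."""
    def __init__(self, hidden: int = 768):
        super().__init__()
        self.intent = nn.Linear(hidden, 8)      # zip-source vocabulary
        self.action = nn.Linear(hidden, 7)      # predictor's call_type vocabulary
        self.sentiment = nn.Linear(hidden, 1)   # scalar regression
        self.urgency = nn.Linear(hidden, 3)
        self.complexity = nn.Linear(hidden, 5)
        self.jung = nn.Linear(hidden, 12)       # speculative, 0.5x loss
        self.campbell = nn.Linear(hidden, 10)   # speculative, 0.5x loss

    def loss(self, cls: torch.Tensor, y: dict) -> torch.Tensor:
        # Joint multi-task loss exactly as stated above.
        return (
            F.cross_entropy(self.intent(cls), y["intent"])
            + F.cross_entropy(self.action(cls), y["action"])
            + F.mse_loss(self.sentiment(cls).squeeze(-1), y["sentiment"])
            + F.cross_entropy(self.urgency(cls), y["urgency"])
            + F.cross_entropy(self.complexity(cls), y["complexity"])
            + 0.5 * F.cross_entropy(self.jung(cls), y["jung"])
            + 0.5 * F.cross_entropy(self.campbell(cls), y["campbell"])
        )
```

In the real model, cls is the encoder's last_hidden_state[:, 0] from the pooler-free RoBERTa backbone.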

Intent classes (8, zip-source vocabulary)

Sourced from _ZIP_TO_INTENT in models/intake/dataset.py. Indices follow sorted(set(...)) order, so they're reproducible across runs:

  1. auto_insurance_customer_service_inbound
  2. automotive_and_healthcare_insurance_inbound
  3. automotive_inbound
  4. customer_service_general_inbound
  5. home_service_inbound
  6. insurance_outbound
  7. medical_equipment_outbound
  8. medicare_inbound
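The index derivation reduces to a couple of lines. The zip keys below are made up for illustration; the real mapping is _ZIP_TO_INTENT in models/intake/dataset.py:

```python
# Hypothetical zips; only the sorted(set(...)) mechanics matter here.
_ZIP_TO_INTENT = {
    "10001": "medicare_inbound",
    "30301": "automotive_inbound",
    "60601": "medicare_inbound",  # duplicate values collapse via set()
}

INTENT_LABELS = sorted(set(_ZIP_TO_INTENT.values()))
LABEL_TO_INDEX = {name: i for i, name in enumerate(INTENT_LABELS)}
# Sorting makes the index assignment reproducible across runs,
# independent of dict insertion order.
```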

Action classes (7, predictor's call_type vocabulary)

The action head exists to fix a silent cascade bug: the intent head emits zip-source labels (above), but the downstream predictor's _INTENT_MAP is keyed on the synthetic call_types it was trained against. Without the action head, every NLP intent was silently mapped to predictor class 0. The action head emits in the predictor's native vocabulary so the cascade carries a feature the predictor actually understands.

Indices match generator/config.py:call_types (do not re-sort):

  1. billing
  2. technical
  3. general
  4. account
  5. complaint
  6. cancellation
  7. upgrade

Silver-label heuristic + class distribution

Action labels are silver-standard, generated by a deterministic keyword heuristic (_action_label_for in models/intake/dataset.py) with priority order: cancellation > upgrade > technical > billing > account > complaint > general. Distribution on the 90,093-record corpus:
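A minimal sketch of a priority-ordered keyword heuristic of this shape; the keyword lists here are illustrative, not the actual ones in models/intake/dataset.py:

```python
# First match in priority order wins; anything unmatched falls
# through to "general", which is why that class dominates.
_ACTION_KEYWORDS = [
    ("cancellation", ("cancel", "terminate my")),
    ("upgrade",      ("upgrade",)),
    ("technical",    ("not working", "error")),
    ("billing",      ("bill", "charge", "invoice")),
    ("account",      ("password", "log in", "login")),
    ("complaint",    ("complaint", "unacceptable")),
]

def action_label_for(transcript: str) -> str:
    text = transcript.lower()
    for label, keywords in _ACTION_KEYWORDS:  # priority order
        if any(k in text for k in keywords):
            return label
    return "general"  # fallback class
```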

| Class | Share | Notes |
|-------|-------|-------|
| general | 56% | Corpus-driven (Medicare ≈ 68% of records, mostly coverage clarifications) |
| billing | 20% | |
| cancellation | 11% | |
| technical | 7% | |
| upgrade | 5% | "premium" keyword removed after 82% false-positive rate on insurance_outbound (matched "your premium will be …") |
| complaint | 1% | |
| account | 1% | |

Imbalance is corrected at training time by a sqrt-tempered inverse-frequency class-weight vector applied to action_CE:

  • Linear 1/n would give account a ~100× weight vs general — too aggressive for silver labels.
  • Sqrt-tempered (mean-normalized) compresses that to ~6.6× (general 0.29 vs account 1.97), correcting the imbalance without overfitting to whichever heuristic mistakes happen to land in the rare classes.
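The tempering math, using the rounded shares from the table above (actual corpus counts differ slightly, so the exact weights won't match the quoted 0.29/1.97 to the last digit):

```python
import math

shares = {  # rounded shares from the distribution table
    "general": 0.56, "billing": 0.20, "cancellation": 0.11,
    "technical": 0.07, "upgrade": 0.05, "complaint": 0.01, "account": 0.01,
}

# Sqrt-tempered inverse frequency, then mean-normalized so the
# average weight is 1.0 and the overall loss scale is preserved.
raw = {c: 1.0 / math.sqrt(p) for c, p in shares.items()}
mean = sum(raw.values()) / len(raw)
weights = {c: w / mean for c, w in raw.items()}
```

With these rounded shares the general→account spread lands around 7.5×, the same ballpark as the ~6.6× quoted above, versus the ~100× a linear 1/n scheme would produce on the raw counts.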

Training (B2 retrain, 2026-04)

  • Backbone: FacebookAI/roberta-base (125M params, pooler removed)
  • Data: 90,093 anonymized real inbound call transcripts across 8 domains; silver labels for the non-intent heads
  • Optimizer: AdamW, lr=2e-5, weight_decay=0.01, linear warmup 10%
  • Batch size: 32 per GPU × 2 GPUs = 64 effective, max_length=512
  • Epochs: 4
  • Hardware: 2× NVIDIA H100 SXM 80GB (RunPod), DDP via accelerate launch --num_processes=2
  • Tokenizer: RobertaTokenizerFast (Rust-backed, byte-identical to slow)
  • Tracked: MLflow experiment prod-intake-b2
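The schedule above (10% linear warmup, then linear decay) reduces to a multiplier on the base LR of 2e-5. This standalone sketch is consistent with that description; the actual trainer code may use a library scheduler instead:

```python
def lr_multiplier(step: int, total_steps: int, warmup_frac: float = 0.10) -> float:
    """Linear warmup over the first warmup_frac of steps, then
    linear decay to zero. Multiply the base LR (2e-5) by this."""
    warmup = max(1, int(total_steps * warmup_frac))
    if step < warmup:
        return step / warmup
    return max(0.0, (total_steps - step) / (total_steps - warmup))
```

At effective batch 64 over 90,093 records, one epoch is ~1,408 optimizer steps, so the 4-epoch run is roughly 5,600 steps with the first ~560 spent warming up.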

Metrics (best checkpoint, epoch 4)

| Metric | Value |
|--------|-------|
| train_loss (sum of 7 head losses) | 5.5984 |
| val_loss | 5.6088 |
| Best val_loss | 5.6088 |
| Wall clock | 25 min total (22 min training, ~3 min upload) |

Train ≈ val throughout (gap < 0.01 at epoch 4) → no overfit signal at 4 epochs. Per-epoch deltas were shrinking (-0.07, -0.04, -0.03, -0.03), indicating the model was approaching convergence; further epochs at decayed LR would likely buy diminishing returns.

The combined loss adds the action head's weighted CE on top of the prior 5-head sum, so the absolute loss is ~1.5–2 units higher than the pre-B2 5-head run (3.99 val); the shape (monotone decrease, train≈val) is the like-for-like comparison.

Intended Use

  • Research on multi-head transformer fine-tuning + cascade-aware vocabulary design
  • Educational demonstrations of NLP → downstream model cascades
  • Benchmarking against the synthetic-data environment in this repo

Out of Scope / Not Intended

  • Any production or commercial use. Not validated for operational deployment.
  • Customer-facing decision-making. The archetype heads (Jung/Campbell) are exploratory and have low calibrated confidence.
  • Any setting where misclassification carries material risk.

Limitations

  • Silver-labeled action head. The action labels come from a keyword heuristic, not human annotation. Class-weight tempering helps, but the head's accuracy is bounded by heuristic accuracy on the rare classes.
  • Archetype heads remain speculative. Confidence is typically <15% on real samples. Treat as exploratory.
  • English-only. Transcripts were US English with finance/medicare/automotive bias; no telecom data in the current corpus.
  • Convergence headroom. 4 epochs at LR 2e-5 was a deliberate stop after the elbow of the loss curve; longer runs at decayed LR may improve val loss slightly.

Future Work

  • Telecom call-type vocabulary. When the corpus expands to telecom transcripts, the action head's vocabulary will migrate to a customer-facing taxonomy: Internet, TV, Mobile, Cancel, Retention, Billing, Other. The existing 7-class call_type vocab (billing/technical/general/account/complaint/cancellation/upgrade) is operationally accurate for the current insurance/medicare/automotive corpus but would feel awkward for telecom. Tracked in project_v21_plan.md.
  • Predictor refresh against the new intake feature distribution (action-channel cascade is now active; predictor was trained against the old intent-channel fallback).
  • Router (PPO) training to close the v2.1 three-model loop.

Changelog

2026-04 — B2 retrain (this card)

  • 6th head added (action, 7-class) bridging the cascade NLP→predictor. Architecture + silver-label heuristic in commit 4933d68; trainer + class weights in 5f517b4; pooler removal for DDP in 1605261.
  • Retrained on 2× H100 SXM via accelerate launch --num_processes=2. 4 epochs, max_length=512, effective batch 64. Best val_loss 5.6088.
  • Sqrt-tempered inverse-frequency class weights applied to action loss to correct the 56%/1% imbalance without overfitting silver-label noise.
  • Cascade switch in notebooks/research.ipynb Section 9b now feeds the predictor intake["action"] (call_type vocab) instead of intake["intent"] (zip-domain vocab); the latter silently fell back to class 0 inside the predictor.

2026-04 — clean-schema retrain

  • Replaced the per-domain .get(domain, 0) label fallback with a strict whitelist (_ZIP_TO_INTENT) and INTENT_LABELS = sorted(set(...)). Previously, three malformed/duplicate zips were silently mislabeled as class 0 (label bleed).
  • Excluded zips: a typo-and-mixed-domain bundle (home_ervice_inbound&telecom _outbound) and two (reupload) duplicates whose transcript counts matched the originals.
  • Intent classes: 10 → 8; transcripts: 95,831 → 90,093.

How to Load

from models.intake.inference import IntakePredictor
from pathlib import Path

predictor = IntakePredictor(
    model_path=Path("trained_models/intake/model.pt"),
    device="cpu",  # or "cuda"
)

result = predictor.predict("Hi, I need help with my account...")
# {
#   'intent': 'medicare_inbound',           # zip-domain vocab
#   'action': 'account',                    # predictor's call_type vocab — feed this to the cascade
#   'sentiment': 0.1, 'urgency': 2, 'complexity': 3,
#   'jung_type': '...', 'campbell_phase': '...',
#   'archetype_confidence': 0.13,
# }

Full pipeline code at tedrubin80/CEPM.

License

CC-BY-NC-4.0. Non-commercial use only. Attribution required.

Citation

@software{attuned_resonance_intake_2026,
  author = {Rubin, Ted},
  title = {Attuned Resonance Intake: Multi-Head RoBERTa for Call-Center Intake},
  year = {2026},
  url = {https://github.com/tedrubin80/CEPM}
}