Veridian.ai — Lost Quote Intelligence System (LQIS)

Real-time, explainable deal-loss prediction for buyer–seller negotiations, with rep-level coaching. Given a running dialogue, the model outputs a loss probability (updated turn-by-turn), the turn at which the deal tipped, the named signals that drove the score, and rule-based coaching.

⚠️ Methodology demonstration, not a production B2B model. Trained on consumer (C2C) Craigslist price-haggling; treat numbers as a demonstration of the method, not certified B2B performance.

Architecture

A hierarchical, conditionally-modulated pipeline:

each turn ──► RoBERTa (frozen, fine-tuned) ──► CLS [768]
                                                  │  sequence
            Temporal Transformer (2L, 8 heads) ◄──┘ ──► mean-pool ──► e_conv [768]
                                                                          │
LIWC features (90-d) ──► Conditioning MLP ──► γ, β [768] ──► e_fused = γ ⊙ e_conv + β
                                                                          │
                                              σ(Linear(768→1)) ──► LOSS PROBABILITY
                                              + SHAP seams (token · turn · feature) ──► coaching
  • Fusion = FiLM (Feature-wise Linear Modulation): low-dimensional linguistic/market signals modulate the conversation representation rather than competing with it as tokens.
  • External market signal is identity / demo-only (external_mode="identity"): it is fetched and shown live but not used in the trained score, to avoid label leakage. There are no learnable external parameters.
  • Calibration: post-hoc Platt scaling (monotonic → AUC preserved) so the loss gauge is usable.
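
The FiLM fusion and scoring head from the diagram can be sketched in a few lines. This is a minimal pure-Python illustration with toy dimensions (4 instead of 768, 3 instead of 90) and random weights. The single hidden layer, the ReLU activation, and every name used here (`conditioning_mlp`, `film_fuse`, `W1`, `W_gamma`, `W_beta`) are assumptions for illustration, not the code shipped in src/.

```python
import math
import random

random.seed(0)

D_CONV, D_FEAT, D_HID = 4, 3, 5  # toy sizes; the card uses 768 and 90


def rand_matrix(rows, cols):
    """Random weights standing in for trained parameters."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(W, x):
    """Plain matrix-vector product."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def conditioning_mlp(features, W1, W_gamma, W_beta):
    """Map low-dimensional features to a per-channel scale (gamma) and shift (beta)."""
    hidden = [max(0.0, h) for h in matvec(W1, features)]  # ReLU is an assumption
    return matvec(W_gamma, hidden), matvec(W_beta, hidden)


def film_fuse(e_conv, gamma, beta):
    """e_fused = gamma * e_conv + beta, element-wise."""
    return [g * e + b for g, e, b in zip(gamma, e_conv, beta)]


def loss_probability(e_fused, w_out, b_out):
    """sigmoid(Linear(d -> 1)) applied to the fused embedding."""
    logit = sum(w * e for w, e in zip(w_out, e_fused)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))


W1 = rand_matrix(D_HID, D_FEAT)
W_gamma = rand_matrix(D_CONV, D_HID)
W_beta = rand_matrix(D_CONV, D_HID)
w_out = [random.uniform(-0.5, 0.5) for _ in range(D_CONV)]

e_conv = [0.2, -1.0, 0.7, 0.1]  # stand-in for the mean-pooled conversation embedding
liwc = [0.1, 0.9, 0.4]          # stand-in for min-max-scaled linguistic features

gamma, beta = conditioning_mlp(liwc, W1, W_gamma, W_beta)
e_fused = film_fuse(e_conv, gamma, beta)
p_loss = loss_probability(e_fused, w_out, 0.0)
print(p_loss)
```

The point of the design is visible in `film_fuse`: the linguistic features never enter the sequence as extra tokens; they only rescale and shift each channel of the conversation embedding.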

Results (held-out test, n = 703)

Model                          Test AUC-ROC
Flat RoBERTa (turn encoder)    0.804
Temporal transformer (val)     0.939
Full FiLM pipeline             0.899

F1 0.635 · Precision 0.753 · Recall 0.549 · Accuracy 0.88. Ablation flat→full = +0.095 AUC (genuine temporal + LIWC lift; external demo-only). Live, real-dialogue discrimination AUC ≈ 0.85.
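
As a quick consistency check, the reported F1 and the ablation lift can be recomputed from the other figures quoted above. The snippet uses only numbers from this card; the variable names are illustrative.

```python
# Figures copied from the results above.
precision, recall = 0.753, 0.549
auc_flat, auc_full = 0.804, 0.899

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

# Ablation lift: full FiLM pipeline minus the flat RoBERTa baseline.
lift = auc_full - auc_flat

print(f"F1 = {f1:.3f}")           # F1 = 0.635
print(f"AUC lift = {lift:+.3f}")  # AUC lift = +0.095
```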

Files

models/roberta_turn_encoder/   # HF-native frozen fine-tuned encoder (AutoModel)
models/temporal_transformer.pt # 2-layer temporal transformer state_dict
models/film_head.pt            # FiLM conditioning + classifier state_dict
models/roberta_aux_head.pt     # aux head for token-SHAP (approximate)
models/liwc_scaler.joblib      # MinMaxScaler (train-fit) for LIWC features
models/calibration.json        # Platt {a,b}
models/metrics.json            # measured metrics
src/                           # modeling code to reconstruct the pipeline
inference.py                   # runnable example
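
A minimal sketch of how the Platt parameters in models/calibration.json might be applied. It assumes the file holds two numbers under the keys "a" and "b", that they act on the raw logit, and that the convention is sigmoid(a * logit + b); the repo's actual convention may differ. The helper names and the parameter values written to the temporary file are made up so the sketch runs without the repository.

```python
import json
import math
import os
import tempfile


def load_platt(path):
    """Read Platt parameters from a JSON file shaped like {"a": ..., "b": ...}."""
    with open(path) as fh:
        params = json.load(fh)
    return float(params["a"]), float(params["b"])


def platt_calibrate(logit, a, b):
    """Calibrated probability = sigmoid(a * logit + b) (assumed sign convention)."""
    return 1.0 / (1.0 + math.exp(-(a * logit + b)))


# Stand-in file with made-up parameters, NOT the values shipped with the model.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "calibration.json")
    with open(path, "w") as fh:
        json.dump({"a": 1.7, "b": -0.4}, fh)
    a, b = load_platt(path)

raw_logits = [-2.0, -0.5, 0.3, 1.8]
calibrated = [platt_calibrate(z, a, b) for z in raw_logits]
print(calibrated)
```

With a > 0 the mapping is strictly increasing, so the ranking of dialogues, and therefore AUC, is unchanged; only the probability values move.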

Usage

pip install -r requirements.txt
python inference.py

To call the scorer from Python:

from src.inference.live_scorer import LiveScorer
scorer = LiveScorer.build()
out = scorer.score([
    {"speaker": "buyer",  "text": "Is the charger still available?"},
    {"speaker": "seller", "text": "Yes, asking $10."},
    {"speaker": "buyer",  "text": "That is too expensive, $4 is my max."},
    {"speaker": "seller", "text": "I could maybe do $8."},
    {"speaker": "buyer",  "text": "No. $4 or I am done."},
], fetch_external=False)
print(out["loss_probability"], out["tipping_turn"], out["coaching"])
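
The turn-by-turn behaviour can be illustrated by scoring every prefix of the dialogue. The sketch below is not how LiveScorer computes tipping_turn, which this card does not specify: it assumes the tipping turn is the turn with the largest rise in loss probability, and it uses a keyword-counting stand-in (`toy_score`) instead of the trained model so that it runs on its own. All helper names are hypothetical.

```python
def score_by_turn(score_fn, turns):
    """Score every prefix of the dialogue: one probability per turn."""
    return [score_fn(turns[: i + 1]) for i in range(len(turns))]


def tipping_turn(probs):
    """1-based turn with the largest rise in loss probability (assumed rule)."""
    if len(probs) < 2:
        return None
    jumps = [probs[i] - probs[i - 1] for i in range(1, len(probs))]
    best = max(range(len(jumps)), key=jumps.__getitem__)
    # jumps[k] is the rise going into turn k + 2 (1-based).
    return best + 2 if jumps[best] > 0 else None


def toy_score(prefix):
    """Stand-in scorer that counts hard-line phrases. NOT the trained model."""
    cues = ("too expensive", "my max", "i am done")
    hits = sum(cue in turn["text"].lower() for turn in prefix for cue in cues)
    return min(1.0, 0.1 + 0.2 * hits)


turns = [
    {"speaker": "buyer",  "text": "Is the charger still available?"},
    {"speaker": "seller", "text": "Yes, asking $10."},
    {"speaker": "buyer",  "text": "That is too expensive, $4 is my max."},
    {"speaker": "seller", "text": "I could maybe do $8."},
    {"speaker": "buyer",  "text": "No. $4 or I am done."},
]

probs = score_by_turn(toy_score, turns)
print(probs)
print(tipping_turn(probs))  # 3
```

To try the same loop with the real model, `toy_score` could be swapped for a small wrapper around `scorer.score(prefix, fetch_external=False)["loss_probability"]`, assuming that key holds a float as the example above suggests.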

Intended use & limitations

  • Use: demonstrate real-time, explainable loss scoring + coaching on short price-negotiation dialogues; research / educational.
  • Out of scope: generic / non-negotiation chat is out-of-distribution and will score near 0% loss probability. Not validated for production B2B decisioning. Probabilities are calibrated on C2C data; expect to recalibrate on real B2B funnels.
  • Explainability ≠ accuracy: the SHAP seams explain a prediction; they don't change it.

Training data & attribution

Fine-tuned on the Craigslist Bargaining corpus (He et al., 2018). This repo redistributes only derived model weights, not the dataset. Please cite:

@inproceedings{he2018decoupling,
  title={Decoupling Strategy and Generation in Negotiation Dialogues},
  author={He, He and Chen, Derek and Balakrishnan, Anusha and Liang, Percy},
  booktitle={EMNLP}, year={2018}
}

roberta-base (MIT) · empath (MIT). License of this repo: MIT (see LICENSE).
