---
license: apache-2.0
language: en
library_name: transformers
base_model: distilbert-base-uncased
tags:
- text-classification
- trading
- intent-classification
- distilbert
- lora
- onnx
- english
pipeline_tag: text-classification
---
# distilbert-trade-decision-classifier-v1
DistilBERT fine-tuned with LoRA (r=32) to classify user replies to trading-agent proposals into one of six decision intents. It sits between a regex fast path and a confirmation prompt, the two bookends of a reply-routing pipeline.
## How it works
Trading agents that DM proposals ("Approve / decline / hold / size N / trim N?") get free-form text replies back. This model converts the reply into one of six discrete intents so the agent can route it deterministically.
The model runs only after a fast-path regex has tried the canonical phrases ("approve", "decline", "size 10"). The regex handles routine replies; the model handles everything the regex does not match.
```
Reply text in
      ↓
Canonical-phrase regex  → catches structured replies cheaply
      ↓ (no match)
THIS MODEL              → classifies into 6 intent labels
      ↓
Decision rule:
  • confidence ≥ 0.85 AND label ≠ UNCLEAR → commit
  • else → confirmation prompt to the user
```
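The flow above can be sketched in a few lines. This is a minimal illustration, not the production router: the `CANONICAL` patterns are hypothetical (the real fast-path regex is not published here), and `classify` is a stand-in callable that is assumed to add the context tags, call the model, and return a `(label, confidence)` pair.

```python
import re
from typing import Callable, Tuple

# Hypothetical canonical-phrase patterns, for illustration only.
CANONICAL = [
    (re.compile(r"^\s*approve\s*$", re.IGNORECASE), "APPROVE"),
    (re.compile(r"^\s*decline\s*$", re.IGNORECASE), "DECLINE"),
    (re.compile(r"^\s*size\s+\d+\s*$", re.IGNORECASE), "COUNTER_SIZE"),
]

CONFIDENCE_THRESHOLD = 0.85  # threshold from the decision rule above


def route(reply: str, classify: Callable[[str], Tuple[str, float]]) -> Tuple[str, str]:
    """Return (action, label), where action is 'commit' or 'confirm'."""
    # 1. Fast path: canonical phrases never reach the model.
    for pattern, label in CANONICAL:
        if pattern.match(reply):
            return "commit", label

    # 2. Model path: commit only on a confident, non-UNCLEAR label.
    label, confidence = classify(reply)
    if confidence >= CONFIDENCE_THRESHOLD and label != "UNCLEAR":
        return "commit", label

    # 3. Everything else goes back to the user as a confirmation prompt.
    return "confirm", label
```

Passing the classifier in as a callable keeps the routing rule testable without loading the model.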
## Labels (6)
| Label | What it covers |
| -------------- | ----------------------------------------------------------------------- |
| APPROVE | Execute the proposal as stated. "approve", "yes", "let's go", "send it" |
| DECLINE | Kill the proposal. "no", "pass", "kill it", "hard pass" |
| HOLD | Active deferral: user is engaged but not deciding yet. "hold off", "checking", "let me think", "leaning approve" |
| COUNTER_SIZE | Execute but at a different share count. "size 10", "dump half", "trim 50" |
| COUNTER_PRICE | Execute but at a different limit price. "at $49", "limit 50", "trim at $48" |
| UNCLEAR | Cannot safely commit. Multi-intent, ambiguous, off-topic, or sarcastic. Falls through to confirmation prompt. |
UNCLEAR is a trained refusal label, not a fallback. The model is expected to emit it on multi-intent, ambiguous, or off-topic inputs. Treat it as the model saying "I don't know, ask the human."
## Inputs
A single string with structural context tags prepended:
```
[dm|group][reply_to:N|no_reply_to][in_flight:K] <reply text>
```
- `[dm]` vs `[group]`: chat surface (DM vs group chat)
- `[reply_to:N]` vs `[no_reply_to]`: whether the user quote-replied to a specific proposal
- `[in_flight:K]`: number of proposals currently awaiting decision
Example inputs:
```
[dm][reply_to:200][in_flight:1] approve
[dm][no_reply_to][in_flight:1] dump half
[dm][reply_to:200][in_flight:2] trim at $49
```
The tags carry context the model can't infer from the text alone: "yes" with 1 proposal in flight is APPROVE; "yes" with 3 in flight and no quote-reply is structurally ambiguous and trained as UNCLEAR.
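A small helper can keep the prefix format consistent between callers. The sketch below is an assumption about how a caller might assemble the string; the function name and argument names are hypothetical, and `reply_to` is assumed to be whatever identifier the upstream system uses for the quoted proposal.

```python
from typing import Optional


def build_input(reply: str, surface: str, reply_to: Optional[int], in_flight: int) -> str:
    """Prepend the structural context tags to the raw reply text."""
    if surface not in ("dm", "group"):
        raise ValueError(f"surface must be 'dm' or 'group', got {surface!r}")
    if in_flight < 0:
        raise ValueError("in_flight must be non-negative")

    # A quote-reply carries its target id; otherwise use the explicit no-reply tag.
    reply_tag = f"[reply_to:{reply_to}]" if reply_to is not None else "[no_reply_to]"
    return f"[{surface}]{reply_tag}[in_flight:{in_flight}] {reply.strip()}"


print(build_input("dump half", "dm", None, 1))
# [dm][no_reply_to][in_flight:1] dump half
```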
## Usage
### Python (transformers)
```python
from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="DoDataThings/distilbert-trade-decision-classifier-v1",
)

result = clf("[dm][reply_to:200][in_flight:1] dump half")
print(result)
# [{'label': 'COUNTER_SIZE', 'score': 0.991}]
```
### Python (onnxruntime, CPU)
```python
import onnxruntime as ort
import numpy as np
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("DoDataThings/distilbert-trade-decision-classifier-v1")
# Path to the exported ONNX file, here assumed to be in the working directory.
sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

text = "[dm][no_reply_to][in_flight:1] hold off"
enc = tok(text, truncation=True, max_length=64, return_tensors="np")

# First output, first (only) row of the batch: one logit per label.
logits = sess.run(
    None,
    {"input_ids": enc["input_ids"], "attention_mask": enc["attention_mask"]},
)[0][0]

# Softmax; subtracting the max keeps np.exp from overflowing.
exp = np.exp(logits - logits.max())
probs = exp / exp.sum()

labels = ["APPROVE", "DECLINE", "HOLD", "COUNTER_SIZE", "COUNTER_PRICE", "UNCLEAR"]
print(labels[int(probs.argmax())], float(probs.max()))
# HOLD 0.943
```
## Deployment shape
The model is not safe to use standalone. Pair with:
- A confidence threshold (we recommend 0.85)
- Deterministic safety rails (position size, available cash, mode gate)
- A confirmation prompt for low-confidence cases
The model picks intent; the system decides whether to act. It does not have final authority over orders.
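As an illustration of the deterministic rails, the sketch below checks a proposed order against three limits before anything executes. All names and fields here (`Order`, `Rails`, `max_shares`, `trading_enabled`) are hypothetical, and the cash check assumes a buy order; the actual rails in a deployment will differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Order:
    shares: int
    limit_price: float


@dataclass
class Rails:
    max_shares: int        # hypothetical position-size cap
    available_cash: float  # hypothetical buying power
    trading_enabled: bool  # hypothetical mode gate


def check_rails(order: Order, rails: Rails) -> List[str]:
    """Return the list of violated rails; an empty list means the order may proceed."""
    violations = []
    if not rails.trading_enabled:
        violations.append("mode")
    if order.shares <= 0 or order.shares > rails.max_shares:
        violations.append("size")
    if order.shares * order.limit_price > rails.available_cash:
        violations.append("cash")
    return violations
```

These checks are independent of the classifier output, so a confident but wrong label still cannot push an order past them.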
## Design decisions
**Narrow-waist split.** The model classifies INTENT only, not proposal context. By design, upstream code disambiguates which proposal the reply targets (via quote-reply or single-default rule), and the model only sees the locked-in case. This makes the model independent of ticker / setup / portfolio specifics; its job is interpreting "what did the user mean," not "which one."
**UNCLEAR as a trained refusal class.** A 5-label classifier forced to pick one of {APPROVE, DECLINE, HOLD, COUNTER_SIZE, COUNTER_PRICE} on ambiguous input is dangerous. The 6th label is the model's escape valve: it is trained on multi-intent, ambiguous, off-topic, and sarcastic inputs so it can refuse rather than guess. Combined with the 0.85 confidence threshold, this caps the blast radius of misclassification: an unsafe input yields either UNCLEAR (refusal) or a non-UNCLEAR label with low confidence (which falls through to the confirmation prompt).
**Structural prefix as text, not special tokens.** The `[dm][reply_to:N][in_flight:K]` tags are concatenated into the input string and tokenized as regular subword pieces. This works with off-the-shelf DistilBERT β no special-token registration, no tokenizer config drift between train and serve. The model learns the bracket conventions naturally via attention.
**Six labels including COUNTER_PRICE.** Earlier versions used five labels. The sixth (COUNTER_PRICE) was added because "trim at $49 instead of $48" is a fundamentally different action from "size 10": the downstream extraction differs (price vs share count). Conflating them would force the consumer to disambiguate post-classification, defeating the purpose of the intent label.
## Evaluation
Held-out eval set: 175 hand-curated adversarial examples, 25 to 34 per class, verified to have zero leakage from the training set.
| Label | Precision | Recall | F1 | Count |
| -------------- | --------- | ------ | ----- | ----- |
| APPROVE | 0.967 | 0.967 | 0.967 | 30 |
| DECLINE | 1.000 | 0.933 | 0.966 | 30 |
| HOLD | 0.970 | 0.941 | 0.955 | 34 |
| COUNTER_SIZE | 0.968 | 1.000 | 0.984 | 30 |
| COUNTER_PRICE | 1.000 | 1.000 | 1.000 | 25 |
| UNCLEAR | 0.821 | 0.885 | 0.852 | 26 |
| **macro avg** | | | **0.954** | 175 |
| **accuracy** | | | **0.954** | |
**Honest assessment.** Zero high-confidence misclassifications on eval (no row labeled wrong at confidence ≥ 0.85). DECLINE and COUNTER_PRICE both hit perfect precision (1.000). UNCLEAR is the weakest class at F1 0.85, and the HOLD/UNCLEAR boundary on multi-intent inputs ("approve but only half") is genuinely fuzzy; these cases can reasonably be labeled either way. The 0.85 confidence threshold is calibrated so weak cases fall to confirmation rather than commit wrong.
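The summary rows can be re-derived from the per-class rows as a sanity check. The sketch below uses only the rounded precision, recall, and count values from the table, so per-class F1 can differ from the table in the third decimal; the number of correct predictions per class is recovered by rounding recall × count.

```python
# (precision, recall, count) per label, copied from the evaluation table.
ROWS = {
    "APPROVE": (0.967, 0.967, 30),
    "DECLINE": (1.000, 0.933, 30),
    "HOLD": (0.970, 0.941, 34),
    "COUNTER_SIZE": (0.968, 1.000, 30),
    "COUNTER_PRICE": (1.000, 1.000, 25),
    "UNCLEAR": (0.821, 0.885, 26),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Macro F1: unweighted mean of the per-class F1 scores.
f1_scores = {label: f1(p, r) for label, (p, r, _) in ROWS.items()}
macro_f1 = sum(f1_scores.values()) / len(f1_scores)

# Accuracy: correct predictions (recall * count, rounded) over all examples.
correct = sum(round(r * n) for _, r, n in ROWS.values())
total = sum(n for _, _, n in ROWS.values())
accuracy = correct / total

print(f"macro F1 = {macro_f1:.3f}, accuracy = {correct}/{total} = {accuracy:.3f}")
# macro F1 = 0.954, accuracy = 167/175 = 0.954
```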
## Training
| Knob | Value |
| ------------------ | ------------------------------------------- |
| Base model | distilbert-base-uncased |
| Adapter | LoRA r=32 on attention projections (q_lin, v_lin) |
| Sequence length | 64 |
| Batch size | 32 |
| Learning rate | 5e-5, cosine schedule, 10% warmup |
| Epochs | 3, early-stop on eval macro-F1 |
| Class weighting | inverse-frequency (functionally uniform; data is balanced within 2%) |
| Hardware | Single RTX 4090 |
| Wall time | ~9 seconds |
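To make the learning-rate row concrete, here is a sketch of a cosine schedule with 10% warmup and a 5e-5 peak. The table does not say how the warmup ramps or where the decay ends; linear warmup from zero and a half-cosine decay to zero are assumptions, chosen because they match the behaviour of `get_cosine_schedule_with_warmup` in Transformers. `total_steps` is a hypothetical input.

```python
import math

PEAK_LR = 5e-5          # from the training table
WARMUP_FRACTION = 0.10  # from the training table


def learning_rate(step: int, total_steps: int) -> float:
    """Learning rate at a given optimizer step (assumed linear warmup, cosine decay to 0)."""
    warmup_steps = int(WARMUP_FRACTION * total_steps)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak.
        return PEAK_LR * step / max(1, warmup_steps)
    # Half-cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```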
## Limitations
1. Classifies INTENT only, not proposal context. The model never sees the actual proposal being responded to, so upstream proposal-disambiguation must run before this model is invoked.
2. COUNTER_SIZE emits intent only; share count extraction is a separate downstream step (regex).
3. COUNTER_PRICE emits intent only; price extraction is a separate downstream step.
4. Trained on author-curated and synthetically-augmented data. Real-world reply variety may exceed training surface forms; expect ~5% of replies to fall to confirmation-prompt fallback.
5. UNCLEAR has the lowest F1 (0.85). The boundary with HOLD (active deferral vs no-position) is fuzzy on multi-intent inputs.
6. English-only. No localization in v1.
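Limitations 2 and 3 leave value extraction to a downstream step. The sketch below shows one way such a step might look; the patterns and function names are hypothetical, not the extraction rules used by the author, and they cover only absolute values. Relative sizes such as "dump half" return `None` and would need separate handling.

```python
import re
from typing import Optional

# Hypothetical patterns: a verb followed by a share count, and a price
# introduced by "at", "limit", or a dollar sign.
_SHARES = re.compile(r"\b(?:size|trim|sell|buy)\s+(\d+)\b", re.IGNORECASE)
_PRICE = re.compile(r"(?:\b(?:at|limit)\s+\$?|\$)(\d+(?:\.\d+)?)", re.IGNORECASE)


def extract_shares(reply: str) -> Optional[int]:
    """Share count for a COUNTER_SIZE reply, or None if no absolute count is present."""
    match = _SHARES.search(reply)
    return int(match.group(1)) if match else None


def extract_price(reply: str) -> Optional[float]:
    """Limit price for a COUNTER_PRICE reply, or None if no price is present."""
    match = _PRICE.search(reply)
    return float(match.group(1)) if match else None
```

A `None` result is a natural trigger for the confirmation prompt, since the intent is known but the value is not.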
## Dataset
Training and evaluation data: [DoDataThings/trade-decision-classifier-v1-dataset](https://huggingface.co/datasets/DoDataThings/trade-decision-classifier-v1-dataset)
## License
Apache 2.0.