manikumargouni committed on
Commit b917e7b · verified · 1 Parent(s): 0bd8c07

Update README.md

Files changed (1)
  1. README.md +182 -462
README.md CHANGED
@@ -1,10 +1,57 @@
- # Agentic Intent Classifier
-
- `agentic-intent-classifier` is a multi-head query classification stack for conversational traffic.
-
- ## Quickstart (recommended): run from Hugging Face Hub
-
- This is the easiest way for developers to test the full production stack (multitask intent + IAB + calibration) without training locally.

  ```python
  from transformers import pipeline
@@ -12,539 +59,212 @@ from transformers import pipeline
  clf = pipeline(
      "admesh-intent",
      model="admesh/agentic-intent-classifier",
-     trust_remote_code=True,
  )

  out = clf("Which laptop should I buy for college?")
- print(out["model_output"]["classification"]["intent"])
- print(out["model_output"]["classification"]["iab_content"])
  print(out["meta"])
  ```

- If you’re running in Colab/Kaggle and see Torch version conflicts, follow `COLAB_SETUP.md`.
-
- ## Latency / inference timing (developer quick check)
-
- The first call includes model/code loading; measure latency after a warm-up call.
-
- Single query:
-
  ```python
  import time
- from transformers import pipeline
-
- clf = pipeline("admesh-intent", model="admesh/agentic-intent-classifier", trust_remote_code=True)
  q = "Which laptop should I buy for college?"

  _ = clf("warm up")
  t0 = time.perf_counter()
  out = clf(q)
- dt_ms = (time.perf_counter() - t0) * 1000
-
- print(f"latency_ms={dt_ms:.1f}")
- print(out["model_output"]["classification"]["intent"])
  ```

- Warm p50 / p95 over 20 runs:
-
  ```python
- import time, statistics
-
- times = []
- for _ in range(20):
-     t0 = time.perf_counter()
-     _ = clf(q)
-     times.append((time.perf_counter() - t0) * 1000)
-
- times_sorted = sorted(times)
- print(f"p50={statistics.median(times):.1f}ms p95={times_sorted[int(0.95*len(times))-1]:.1f}ms mean={statistics.mean(times):.1f}ms")
- ```
-
- It currently produces:
-
- - `intent.type`
- - `intent.subtype`
- - `intent.decision_phase`
- - `iab_content`
- - calibrated confidence per head
- - combined fallback / policy / opportunity decisions
-
- The repo is beyond the original v0.1 baseline. It now includes:
-
- - shared config and label ownership
- - reusable model runtime
- - calibrated confidence and threshold gating
- - combined inference with fallback/policy logic
- - request/response validation in the demo API
- - repeatable evaluation and regression suites
- - full-TSV IAB taxonomy retrieval support through tier4
- - a local embedding index for taxonomy-node retrieval over IAB content paths
- - a separate synthetic full-intent-taxonomy augmentation dataset for non-IAB heads
- - a dedicated intent-type difficulty dataset and held-out benchmark with `easy`, `medium`, and `hard` cases
- - a dedicated decision-phase difficulty dataset and held-out benchmark with `easy`, `medium`, and `hard` cases
-
- Generated model weights are intentionally not committed.
-
- ## Current Taxonomy
-
- ### `intent.type`
-
- - `informational`
- - `exploratory`
- - `commercial`
- - `transactional`
- - `support`
- - `personal_reflection`
- - `creative_generation`
- - `chit_chat`
- - `ambiguous`
- - `prohibited`
-
- ### `intent.decision_phase`
-
- - `awareness`
- - `research`
- - `consideration`
- - `decision`
- - `action`
- - `post_purchase`
- - `support`
-
- ### `intent.subtype`
-
- - `education`
- - `product_discovery`
- - `comparison`
- - `evaluation`
- - `deal_seeking`
- - `provider_selection`
- - `signup`
- - `purchase`
- - `booking`
- - `download`
- - `contact_sales`
- - `task_execution`
- - `onboarding_setup`
- - `troubleshooting`
- - `account_help`
- - `billing_help`
- - `follow_up`
- - `emotional_reflection`
-
- ### `iab_content`
-
- - candidates are derived from every row in [data/iab-content/Content Taxonomy 3.0.tsv](data/iab-content/Content%20Taxonomy%203.0.tsv)
- - retrieval output supports `tier1`, `tier2`, `tier3`, and optional `tier4`
-
- ## What The System Does
-
- - runs three classifier heads:
-   - `intent_type`
-   - `intent_subtype`
-   - `decision_phase`
- - resolves `iab_content` through a local embedding index over taxonomy nodes plus generic label/path reranking
- - applies calibration artifacts when present
- - computes `commercial_score`
- - applies fallback when confidence is too weak or policy-safe blocking is required
- - emits a schema-validated combined envelope
-
- ## What The System Does Not Do
-
- - it is not a multi-turn memory system
- - it is not a production-optimized low-latency serving path
- - it is not yet trained on large real-traffic human-labeled intent data
- - combined decision logic is still heuristic, even though it is materially stronger than the original baseline
-
- ## Project Layout
-
- - [config.py](config.py): labels, thresholds, artifact paths, model paths
- - [model_runtime.py](model_runtime.py): shared calibrated inference runtime
- - [combined_inference.py](combined_inference.py): composed system response
- - [inference_intent_type.py](inference_intent_type.py): direct `intent_type` inference entrypoint
- - [inference_iab_classifier.py](inference_iab_classifier.py): direct supervised `iab_content` inference entrypoint
- - [schemas.py](schemas.py): request/response validation
- - [demo_api.py](demo_api.py): local validated API
- - [iab_taxonomy.py](iab_taxonomy.py): full taxonomy parser/index
- - [iab_classifier.py](iab_classifier.py): supervised IAB runtime with taxonomy-aware parent fallback
- - [iab_retrieval.py](iab_retrieval.py): optional shadow retrieval baseline
- - [training/build_full_intent_taxonomy_dataset.py](training/build_full_intent_taxonomy_dataset.py): separate synthetic intent augmentation dataset
- - [training/build_intent_type_difficulty_dataset.py](training/build_intent_type_difficulty_dataset.py): extra `intent_type` augmentation plus held-out difficulty benchmark
- - [training/build_decision_phase_difficulty_dataset.py](training/build_decision_phase_difficulty_dataset.py): extra `decision_phase` augmentation plus held-out difficulty benchmark
- - [training/build_subtype_difficulty_dataset.py](training/build_subtype_difficulty_dataset.py): extra `intent_subtype` augmentation plus held-out difficulty benchmark
- - [training/build_subtype_dataset.py](training/build_subtype_dataset.py): subtype dataset generation from existing corpora
- - [training/train_iab.py](training/train_iab.py): train the supervised IAB classifier head
- - [training/build_iab_taxonomy_embeddings.py](training/build_iab_taxonomy_embeddings.py): build local IAB node embedding artifacts
- - [training/run_full_training_pipeline.py](training/run_full_training_pipeline.py): full multi-head training/calibration/eval pipeline
- - [evaluation/run_evaluation.py](evaluation/run_evaluation.py): repeatable benchmark runner
- - [evaluation/run_regression_suite.py](evaluation/run_regression_suite.py): known-failure regression runner
- - [evaluation/run_iab_mapping_suite.py](evaluation/run_iab_mapping_suite.py): IAB behavior-lock regression runner
- - [evaluation/run_iab_quality_suite.py](evaluation/run_iab_quality_suite.py): curated IAB quality-target runner
- - [known_limitations.md](known_limitations.md): current gaps and caveats
-
- ## Quickstart: Run From Hugging Face
-
- Download the trained bundle and run inference in three lines — no local training required.
-
- ```python
- import sys
- from huggingface_hub import snapshot_download
-
- # Download the full bundle (models + calibration + code)
- local_dir = snapshot_download(
-     repo_id="admesh/agentic-intent-classifier",
-     repo_type="model",
  )
- sys.path.insert(0, local_dir)
-
- # Import and instantiate
- from pipeline import AdmeshIntentPipeline
- clf = AdmeshIntentPipeline()

- # Classify
- import json
  result = clf("Which laptop should I buy for college?")
- print(json.dumps(result, indent=2))
- ```
-
- Or use the one-liner factory method:
-
- ```python
- from pipeline import AdmeshIntentPipeline  # after sys.path.insert above
-
- clf = AdmeshIntentPipeline.from_pretrained("admesh/agentic-intent-classifier")
- result = clf("I need a CRM for a 5-person startup")
  ```

- Batch mode and custom thresholds are also supported:

  ```python
- # Batch
  results = clf([
      "Best running shoes under $100",
-     "How does gradient descent work?",
      "Buy noise-cancelling headphones",
  ])

- # Custom confidence thresholds
  result = clf(
-     "Buy noise-cancelling headphones",
      threshold_overrides={"intent_type": 0.6, "intent_subtype": 0.35},
  )
  ```

- Verify artifacts and run a smoke test from the CLI:
-
- ```bash
- cd "<local_dir>"
- python3 training/pipeline_verify.py
- python3 combined_inference.py "Which CRM should I buy for a 3-person startup?"
- ```
-
- Pin a specific revision for reproducibility:
-
- ```python
- local_dir = snapshot_download(
-     repo_id="admesh/agentic-intent-classifier",
-     repo_type="model",
-     revision="0584798f8efee6beccd778b0afa06782ab5add60",
- )
- ```
-
  ---

- ## Setup (for local training)
-
- ```bash
- python3 -m venv .venv
- source .venv/bin/activate
- pip install -r agentic-intent-classifier/requirements.txt
- ```
-
- ## Inference (local training path)
-
- Run one query locally:
-
- ```bash
- cd agentic-intent-classifier
- python3 training/train_iab.py
- python3 training/calibrate_confidence.py --head iab_content
- python3 combined_inference.py "Which CRM should I buy for a 3-person startup?"
- ```
-
- Run only the `intent_type` head:
-
- ```bash
- cd agentic-intent-classifier
- python3 inference_intent_type.py "best shoes under 100"
- ```
-
- Run the demo API:
-
- ```bash
- cd agentic-intent-classifier
- python3 demo_api.py
- ```

- Example request:
-
- ```bash
- curl -sS -X POST http://127.0.0.1:8008/classify \
-   -H 'Content-Type: application/json' \
-   -d '{"text":"I cannot log into my account"}'
- ```

- Infra endpoints:

  ```bash
- curl -sS http://127.0.0.1:8008/health
- curl -sS http://127.0.0.1:8008/version
  ```

- Train only the IAB classifier head:
-
- ```bash
- cd agentic-intent-classifier
- python3 training/train_iab.py
- python3 training/calibrate_confidence.py --head iab_content
- ```
-
- The online `iab_content` path now uses the compact supervised classifier. Retrieval is still available as an optional shadow baseline.
-
- Build the optional retrieval shadow index:
-
- ```bash
- cd agentic-intent-classifier
- python3 training/build_iab_taxonomy_embeddings.py
- ```
-
- By default the shadow retrieval path uses `Alibaba-NLP/gte-Qwen2-1.5B-instruct`. The retrieval runtime applies the model's query-side instruction format and last-token pooling, matching the Hugging Face usage guidance. If you want to point retrieval at a different embedding model, set `IAB_RETRIEVAL_MODEL_NAME_OVERRIDE` before building the index.
-
- Open-source users can swap in their own embedding model, but the contract is:
-
- - query embeddings and taxonomy-node embeddings must be produced by the same model and model revision
- - after changing models, you must rebuild `artifacts/iab/taxonomy_embeddings.pt`
- - the repository only tests and supports the default model path out of the box
- - not every Hugging Face embedding model is drop-in compatible with this runtime; some require custom pooling, query instructions, or `trust_remote_code`
-
- Example override:
-
- ```bash
- cd agentic-intent-classifier
- export IAB_RETRIEVAL_MODEL_NAME_OVERRIDE=mixedbread-ai/mxbai-embed-large-v1
- python3 training/build_iab_taxonomy_embeddings.py
- ```
-
- This writes:
-
- - `artifacts/iab/taxonomy_nodes.json`
- - `artifacts/iab/taxonomy_embeddings.pt`
-
- ## Training
-
- ### Full local pipeline
-
- ```bash
- cd agentic-intent-classifier
- python3 training/run_full_training_pipeline.py
- ```
-
- This pipeline now does:
-
- 1. build separate full-intent-taxonomy augmentation data
- 2. build separate `intent_type` difficulty augmentation + benchmark
- 3. train `intent_type`
- 4. build subtype corpus
- 5. build separate `intent_subtype` difficulty augmentation + benchmark
- 6. train `intent_subtype`
- 7. build separate `decision_phase` difficulty augmentation + benchmark
- 8. train `decision_phase`
- 9. train `iab_content`
- 10. calibrate all classifier heads, including `iab_content`
- 11. run regression/evaluation unless `--skip-full-eval` is used
-
- ### Build datasets individually
-
- Separate full-intent augmentation:
-
- ```bash
- cd agentic-intent-classifier
- python3 training/build_full_intent_taxonomy_dataset.py
- ```
-
- Intent-type difficulty augmentation and benchmark:
-
- ```bash
- cd agentic-intent-classifier
- python3 training/build_intent_type_difficulty_dataset.py
- ```
-
- Decision-phase difficulty augmentation and benchmark:
-
- ```bash
- cd agentic-intent-classifier
- python3 training/build_decision_phase_difficulty_dataset.py
- ```
-
- Subtype difficulty augmentation and benchmark:
-
- ```bash
- cd agentic-intent-classifier
- python3 training/build_subtype_difficulty_dataset.py
- ```
-
- Subtype dataset:

- ```bash
- cd agentic-intent-classifier
- python3 training/build_subtype_dataset.py
- ```

- IAB embedding index:

- ```bash
- cd agentic-intent-classifier
- python3 training/build_iab_taxonomy_embeddings.py
- ```

- ### Train heads individually

- ```bash
- cd agentic-intent-classifier
- python3 training/train.py
- python3 training/train_subtype.py
- python3 training/train_decision_phase.py
  ```
 
- ### Calibration

- ```bash
- cd agentic-intent-classifier
- python3 training/calibrate_confidence.py --head intent_type
- python3 training/calibrate_confidence.py --head intent_subtype
- python3 training/calibrate_confidence.py --head decision_phase
- ```

- ## Evaluation

- Full evaluation:

- ```bash
- cd agentic-intent-classifier
- python3 evaluation/run_evaluation.py
  ```

- Known-failure regression:

- ```bash
- cd agentic-intent-classifier
- python3 evaluation/run_regression_suite.py
  ```

- IAB behavior-lock regression:

- ```bash
- cd agentic-intent-classifier
- python3 evaluation/run_iab_mapping_suite.py
- ```

- IAB quality-target evaluation:

- ```bash
- cd agentic-intent-classifier
- python3 evaluation/run_iab_quality_suite.py
- ```

- Threshold sweeps:

- ```bash
- cd agentic-intent-classifier
- python3 evaluation/sweep_intent_threshold.py
- ```

- Artifacts are written to:

  - `artifacts/calibration/`
- - `artifacts/evaluation/latest/`
-
- ## Google Colab
-
- Use Colab for the full retraining pass if local memory is limited.
-
- Clone once:
-
- ```bash
- %cd /content
- !git clone https://github.com/GouniManikumar12/agentic-intent-classifier.git
- %cd /content/agentic-intent-classifier
- ```
-
- If the repo is already cloned and you want the latest code, pull manually:
-
- ```bash
- !git pull origin main
- ```
-
- Full pipeline:
-
- ```bash
- !python training/run_full_training_pipeline.py
- ```
-
- If full evaluation is too heavy for the current Colab runtime:
-
- ```bash
- !python training/run_full_training_pipeline.py \
-   --iab-embedding-batch-size 32 \
-   --skip-full-eval
- ```
-
- Then run eval separately after training:

- ```bash
- !python evaluation/run_regression_suite.py
- !python evaluation/run_iab_mapping_suite.py
- !python evaluation/run_iab_quality_suite.py
- !python evaluation/run_evaluation.py
- ```
-
- ## Current Saved Metrics

- Generate fresh metrics with:

- ```bash
- cd agentic-intent-classifier
- python3 evaluation/run_evaluation.py
  ```

- Do not treat any checked-in summary as canonical unless it was regenerated after the current code and artifacts were built. The IAB path is now retrieval-based, so older saved reports from the deleted hierarchy stack are not meaningful.
-
- ## Latency Note
-
- `combined_inference.py` is a debugging/offline path, not a production latency path.
-
- Current production truth:
-
- - per-request CLI execution is not a sub-50ms architecture
- - production serving should use a long-lived API process with preloaded models
- - if sub-50ms becomes a hard requirement, the serving path will need:
-   - persistent loaded models
-   - runtime optimization
-   - likely fewer model passes or a shared multi-head model
-
- ## Current Status
-
- Current repo status:

- - full 10-class `intent.type` taxonomy is wired
- - subtype and phase heads are present
- - difficulty benchmarks are wired for `intent_type`, `intent_subtype`, and `decision_phase`
- - full-TSV IAB taxonomy retrieval is wired through tier4
- - separate full-intent augmentation dataset is in place
- - evaluation/runtime memory handling is improved for large IAB splits

- The main remaining gap is not basic infrastructure anymore. It is improving real-world robustness, especially for:

- - `decision_phase`
- - `intent_subtype`
- - confidence quality on borderline commercial queries
- - real-traffic supervision beyond synthetic data
+ ---
+ language:
+ - en
+ library_name: transformers
+ pipeline_tag: text-classification
+ base_model: distilbert-base-uncased
+ metrics:
+ - accuracy
+ - f1
+ tags:
+ - intent-classification
+ - multitask
+ - iab
+ - conversational-ai
+ - adtech
+ - calibrated-confidence
+ license: apache-2.0
+ ---
+
+ # admesh/agentic-intent-classifier
+
+ Production-ready intent + IAB classifier bundle for conversational traffic.
+
+ Combines multitask intent modeling, supervised IAB content classification, and per-head confidence calibration to support safe monetization decisions in real time.
+
+ ## Links
+
+ - Hugging Face: https://huggingface.co/admesh/agentic-intent-classifier
+ - GitHub: https://github.com/GouniManikumar12/agentic-intent-classifier

+ ## What It Predicts

+ | Field | Description |
+ |---|---|
+ | `intent.type` | `commercial`, `informational`, `navigational`, `transactional`, … |
+ | `intent.subtype` | `product_discovery`, `comparison`, `how_to`, … |
+ | `intent.decision_phase` | `awareness`, `consideration`, `decision`, … |
+ | `iab_content` | IAB Content Taxonomy 3.0 tier1 / tier2 / tier3 labels |
+ | `component_confidence` | Per-head calibrated confidence with threshold flags |
+ | `system_decision` | Monetization eligibility, opportunity type, policy |
+
+ ---
+
+ ## Deployment Options
+
+ ### 0. Colab / Kaggle Quickstart (copy/paste)
+
+ ```python
+ !pip -q install -U pip
+ !pip -q install -U "torch==2.10.0" "torchvision==0.25.0" "torchaudio==2.10.0"
+ !pip -q install -U "transformers>=4.36.0" "huggingface_hub>=0.20.0" "safetensors>=0.4.0"
+ ```

+ Restart the runtime after installs (**Runtime → Restart runtime**) so the new Torch version is actually used.

  ```python
  from transformers import pipeline

  clf = pipeline(
      "admesh-intent",
      model="admesh/agentic-intent-classifier",
+     trust_remote_code=True,  # required (custom pipeline + multi-model bundle)
  )

  out = clf("Which laptop should I buy for college?")
  print(out["meta"])
+ print(out["model_output"]["classification"]["intent"])
  ```

+ ---

+ ## Latency / inference timing (quick check)

+ The first call includes model/code loading. Warm up once, then measure:

  ```python
  import time
  q = "Which laptop should I buy for college?"

  _ = clf("warm up")
  t0 = time.perf_counter()
  out = clf(q)
+ print(f"latency_ms={(time.perf_counter() - t0) * 1000:.1f}")
  ```
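A single warm call is noisy; for steadier numbers, time several warm runs and report percentiles. A small sketch that works with the `clf` and `q` above (the helper itself accepts any callable):

```python
import statistics
import time

def warm_percentiles(clf, query, runs=20):
    """Time `clf(query)` over `runs` warm calls; return (p50_ms, p95_ms, mean_ms)."""
    times_ms = []
    for _ in range(runs):
        t0 = time.perf_counter()
        clf(query)
        times_ms.append((time.perf_counter() - t0) * 1000)
    ordered = sorted(times_ms)
    p95 = ordered[int(0.95 * len(ordered)) - 1]
    return statistics.median(times_ms), p95, statistics.mean(times_ms)

# usage: p50, p95, mean = warm_percentiles(clf, q)
```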

+ ### 1. `transformers.pipeline()` anywhere (Python)

  ```python
+ from transformers import pipeline

+ clf = pipeline(
+     "admesh-intent",
+     model="admesh/agentic-intent-classifier",
+     trust_remote_code=True,
  )

  result = clf("Which laptop should I buy for college?")
  ```

+ Batch and custom thresholds:

  ```python
+ # batch
  results = clf([
      "Best running shoes under $100",
+     "How does TCP work?",
      "Buy noise-cancelling headphones",
  ])

+ # custom confidence thresholds
  result = clf(
+     "Buy headphones",
      threshold_overrides={"intent_type": 0.6, "intent_subtype": 0.35},
  )
  ```

  ---

+ ### 2. HF Inference Endpoints (managed, deploy to AWS / Azure / GCP)

+ 1. Go to https://ui.endpoints.huggingface.co
+ 2. **New Endpoint** → select `admesh/agentic-intent-classifier`
+ 3. Framework: **PyTorch** — Task: **Text Classification**
+ 4. Enable **"Load with trust_remote_code"**
+ 5. Deploy

+ The endpoint serves the same `pipeline()` interface above via REST:

  ```bash
+ curl https://<your-endpoint>.endpoints.huggingface.cloud \
+   -H "Authorization: Bearer $HF_TOKEN" \
+   -H "Content-Type: application/json" \
+   -d '{"inputs": "Which laptop should I buy for college?"}'
  ```
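The same request can be issued from Python with only the standard library; the endpoint URL and token below are placeholders for your own deployment:

```python
import json
import urllib.request

def build_request(endpoint_url, token, text):
    """Build the same REST call as the curl above (stdlib only)."""
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps({"inputs": text}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

# usage against a deployed endpoint:
# with urllib.request.urlopen(build_request(url, token, "Which laptop should I buy?")) as resp:
#     result = json.loads(resp.read())
```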

+ ---

+ ### 3. HF Spaces (Gradio / Streamlit demo)

+ ```python
+ # app.py for a Gradio Space
+ import gradio as gr
+ from transformers import pipeline

+ clf = pipeline(
+     "admesh-intent",
+     model="admesh/agentic-intent-classifier",
+     trust_remote_code=True,
+ )

+ def classify(text):
+     return clf(text)

+ gr.Interface(fn=classify, inputs="text", outputs="json").launch()
  ```

+ ---

+ ### 4. Local / notebook via `snapshot_download`

+ ```python
+ import sys
+ from huggingface_hub import snapshot_download

+ local_dir = snapshot_download(
+     repo_id="admesh/agentic-intent-classifier",
+     repo_type="model",
+ )
+ sys.path.insert(0, local_dir)

+ from pipeline import AdmeshIntentPipeline
+ clf = AdmeshIntentPipeline()
+ result = clf("I need a CRM for a 5-person startup")
  ```

+ Or the one-liner factory:

+ ```python
+ from pipeline import AdmeshIntentPipeline
+ clf = AdmeshIntentPipeline.from_pretrained("admesh/agentic-intent-classifier")
  ```

+ ---

+ ## Troubleshooting (avoid environment errors)

+ ### `No module named 'combined_inference'` (or similar)

+ This means the Hub repo root is missing required Python files. Ensure these exist at the **root of the model repo** (same level as `pipeline.py`):

+ - `pipeline.py`, `config.json`, `config.py`
+ - `combined_inference.py`, `schemas.py`
+ - `model_runtime.py`, `multitask_runtime.py`, `multitask_model.py`
+ - `inference_intent_type.py`, `inference_subtype.py`, `inference_decision_phase.py`, `inference_iab_classifier.py`
+ - `iab_classifier.py`, `iab_taxonomy.py`
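A quick way to check this before loading is to list the repo contents with `huggingface_hub.list_repo_files` and diff against the required set; a small sketch (the required-file set is copied from the list above):

```python
REQUIRED_ROOT_FILES = {
    "pipeline.py", "config.json", "config.py",
    "combined_inference.py", "schemas.py",
    "model_runtime.py", "multitask_runtime.py", "multitask_model.py",
    "inference_intent_type.py", "inference_subtype.py",
    "inference_decision_phase.py", "inference_iab_classifier.py",
    "iab_classifier.py", "iab_taxonomy.py",
}

def missing_root_files(repo_files):
    """Return the required files absent from a repo file listing."""
    return sorted(REQUIRED_ROOT_FILES - set(repo_files))

# usage (queries the Hub):
# from huggingface_hub import list_repo_files
# print(missing_root_files(list_repo_files("admesh/agentic-intent-classifier")))
```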
 
+ ### `does not appear to have a file named model.safetensors`

+ Transformers requires a standard checkpoint at the repo root for `pipeline()` to initialize. This repo includes a **small dummy** `model.safetensors` + tokenizer files at the root for compatibility; the *real* production weights live in:

+ - `multitask_intent_model_output/`
+ - `iab_classifier_model_output/`
  - `artifacts/calibration/`

+ ---

+ ## Example Output

+ ```json
+ {
+   "model_output": {
+     "classification": {
+       "iab_content": {
+         "taxonomy": "IAB Content Taxonomy",
+         "taxonomy_version": "3.0",
+         "tier1": {"id": "552", "label": "Style & Fashion"},
+         "tier2": {"id": "579", "label": "Men's Fashion"},
+         "mapping_mode": "exact",
+         "mapping_confidence": 0.73
+       },
+       "intent": {
+         "type": "commercial",
+         "subtype": "product_discovery",
+         "decision_phase": "consideration",
+         "confidence": 0.9549,
+         "commercial_score": 0.656
+       }
+     }
+   },
+   "system_decision": {
+     "policy": {
+       "monetization_eligibility": "allowed_with_caution",
+       "eligibility_reason": "commercial_discovery_signal_present"
+     },
+     "opportunity": {"type": "soft_recommendation", "strength": "medium"}
+   },
+   "meta": {
+     "system_version": "0.6.0-phase4",
+     "calibration_enabled": true,
+     "iab_mapping_is_placeholder": false
+   }
+ }
+ ```
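As an illustration of consuming this envelope, a hedged sketch of a downstream gate; the accepted intent types and any `monetization_eligibility` values beyond `allowed_with_caution` are assumptions, not part of the documented schema:

```python
def is_monetizable(result, min_confidence=0.5):
    """Gate on the envelope above; allowed-value sets are illustrative assumptions."""
    intent = result["model_output"]["classification"]["intent"]
    policy = result["system_decision"]["policy"]
    return (
        intent["type"] in {"commercial", "transactional"}  # assumed monetizable types
        and intent["confidence"] >= min_confidence
        and policy["monetization_eligibility"] in {"allowed", "allowed_with_caution"}
    )
```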
+
+ ## Reproducible Revision

+ ```python
+ from huggingface_hub import snapshot_download
+
+ local_dir = snapshot_download(
+     repo_id="admesh/agentic-intent-classifier",
+     repo_type="model",
+     revision="0584798f8efee6beccd778b0afa06782ab5add60",
+ )
  ```

+ ## Included Artifacts

+ | Path | Contents |
+ |---|---|
+ | `multitask_intent_model_output/` | DistilBERT multitask weights + tokenizer |
+ | `iab_classifier_model_output/` | IAB content classifier weights + tokenizer |
+ | `artifacts/calibration/` | Per-head temperature + threshold JSONs |
+ | `pipeline.py` | `AdmeshIntentPipeline` (transformers.Pipeline subclass) |
+ | `combined_inference.py` | Core inference logic |

+ ## Notes

+ - `trust_remote_code=True` is required because this model uses a custom multi-head architecture that does not map to a single standard `AutoModel` checkpoint.
+ - `meta.iab_mapping_is_placeholder: true` means IAB artifacts were missing or skipped; train and calibrate IAB for full production accuracy.
+ - For long-running servers, instantiate once and reuse — models are cached in memory after the first call.
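The load-once pattern in the last note can be sketched as a small thread-safe holder; the `factory` argument stands in for the `pipeline(...)` call from Deployment Options:

```python
import threading

class ClassifierHolder:
    """Create the pipeline once per process and reuse it across request threads."""

    def __init__(self, factory):
        # e.g. factory = lambda: pipeline("admesh-intent",
        #                                 model="admesh/agentic-intent-classifier",
        #                                 trust_remote_code=True)
        self._factory = factory
        self._clf = None
        self._lock = threading.Lock()

    def __call__(self, text, **kwargs):
        if self._clf is None:
            with self._lock:
                if self._clf is None:  # double-checked: only the first caller pays the load cost
                    self._clf = self._factory()
        return self._clf(text, **kwargs)
```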