dejanseo committed on
Commit 9b29aef · verified · 1 Parent(s): e423891

Update README.md

  1. README.md +39 -19
README.md CHANGED
@@ -20,40 +20,50 @@ tags:
  [![Dejan AI Logo](https://dejan.ai/wp-content/uploads/2024/02/dejan.png)](https://dejan.ai/blog/grounding-classifier/)

- ## Prompt Grounding Classifier — DeBERTa v3 Large (Fine-Tuned)

- This model predicts whether a natural language prompt **requires grounding** in external sources such as search, database, or retrieval-augmented generation (RAG).

- It was fine-tuned from [microsoft/deberta-v3-large](https://huggingface.co/microsoft/deberta-v3-large) using a binary label format (`1 = requires grounding`, `0 = self-contained`).

- ### Why this matters

- Routing decisions matter. This classifier acts as a gatekeeper for LLM pipelines by predicting whether a prompt should trigger external retrieval. It optimizes performance, reduces latency, and avoids unnecessary API calls.

  ---

- ## Model Details

- - 🧠 **Architecture**: DeBERTa v3 Large
- - ⚙️ **Training**: Full fine-tuning (no PEFT)
- - 🧪 **Batch size**: 24 (with accumulation)
- - 🔁 **Scheduler**: Cosine learning rate decay with warmup
- - 📉 **Dropout adjusted**: 0.1 for attention and hidden layers
- - 📦 **Final checkpoint size**: ~1.7 GB

  ---

- ## Example Predictions

- | Prompt | Grounding | Confidence |
- |------------------------------------------------------|-----------|------------|
- | What’s the exchange rate for USD to Yen right now? | 1 | 0.999 |
- | Tell me a bedtime story about a robot and a dragon. | 0 | 0.996 |
- | Who is the current CEO of Microsoft? | 1 | 0.998 |

  ---

- ## How to Use

  ```python
  from transformers import AutoTokenizer, AutoModelForSequenceClassification
@@ -68,3 +78,13 @@ outputs = model(**inputs).logits
  probs = F.softmax(outputs, dim=-1)
  label = probs.argmax().item()
  confidence = probs[0][label].item()

  [![Dejan AI Logo](https://dejan.ai/wp-content/uploads/2024/02/dejan.png)](https://dejan.ai/blog/grounding-classifier/)

+ # Prompt Grounding Classifier — DeBERTa v3 Large (Fine-Tuned)

+ This model predicts whether a prompt **requires grounding** in external sources like web search, databases, or RAG pipelines.

+ It was fine-tuned from [microsoft/deberta-v3-large](https://huggingface.co/microsoft/deberta-v3-large) using binary labels:

+ - `1` = grounding required
+ - `0` = self-contained prompt

+ ---
+
+ ## 🚀 Use Case
+
+ This classifier acts as a **routing layer** in an LLM pipeline, helping decide:
+
+ - When to trigger retrieval
+ - When to let the model respond from internal knowledge
+ - How to optimize for latency and cost
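
The routing decision listed above could be sketched roughly as follows. This is a minimal illustration and not part of the commit: the function name `route_prompt`, the `min_confidence` threshold, and the choice to fall back to retrieval when the classifier is unsure are all assumptions. It takes the `label` and `confidence` values that the usage snippet in this README produces.

```python
def route_prompt(label: int, confidence: float, min_confidence: float = 0.8) -> str:
    """Return "retrieve" or "direct" for one classified prompt.

    label: 1 = grounding required, 0 = self-contained (as in this README).
    confidence: softmax probability of the predicted label.
    min_confidence: hypothetical threshold, to be tuned on your own traffic.
    """
    if label not in (0, 1):
        raise ValueError(f"unexpected label: {label!r}")
    # Confident "self-contained" prediction: answer from internal knowledge.
    if label == 0 and confidence >= min_confidence:
        return "direct"
    # Grounding required, or the classifier is unsure: fall back to retrieval.
    return "retrieve"


print(route_prompt(1, 0.999))   # retrieve
print(route_prompt(0, 0.9961))  # direct
print(route_prompt(0, 0.55))    # retrieve
```

Falling back to retrieval on low confidence trades some latency and cost for fewer ungrounded answers; a pipeline that prioritizes speed might invert that default.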

  ---

+ ## 📦 Training Details

+ - Model: DeBERTa v3 Large
+ - Fine-tuning: Full (no adapters)
+ - Dropout: 0.1
+ - Scheduler: Cosine with warmup
+ - Batch size: 24 (accumulated)
+ - Evaluation: every 500 steps
+ - Metric used for best checkpoint: F1

  ---

+ ## 🧪 Example Predictions

+ | Prompt | Grounding | Confidence |
+ |---------------------------------------------------------|-----------|------------|
+ | What’s the exchange rate for USD to Yen right now? | 1 | 0.999 |
+ | Tell me a bedtime story about a robot and a dragon. | 0 | 0.9961 |
+ | Who is the current CEO of Microsoft? | 1 | 0.9986 |

  ---

+ ## 🧠 How to Use

  ```python
  from transformers import AutoTokenizer, AutoModelForSequenceClassification
 
  probs = F.softmax(outputs, dim=-1)
  label = probs.argmax().item()
  confidence = probs[0][label].item()
+ ```
+
+ ---
+
+ ## 🧾 Dataset Origin
+
+ Prompts were collected using a Gemini 2.5 Pro + Google Search toolchain with grounding enabled. Each prompt's response was parsed to extract Gemini's grounding confidence, used as soft supervision for binary labeling:
+
+ - Label 1 if grounded confidence present
+ - Label 0 if response required no external evidence
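
The labeling rule above might look something like the sketch below. It is illustrative only: the `ParsedResponse` shape, the `grounding_confidence` field name, and the sample values are assumptions, not the actual parsing code or data behind this dataset.

```python
from typing import List, Optional, Tuple, TypedDict


class ParsedResponse(TypedDict):
    # Hypothetical record shape; the real parser output is not documented here.
    prompt: str
    grounding_confidence: Optional[float]  # None when the response carried no grounding


def to_label(response: ParsedResponse) -> int:
    """Label 1 if a grounding confidence is present, otherwise label 0."""
    return 1 if response["grounding_confidence"] is not None else 0


# Placeholder records for illustration; the confidence value is made up.
responses: List[ParsedResponse] = [
    {"prompt": "Who is the current CEO of Microsoft?", "grounding_confidence": 0.9},
    {"prompt": "Tell me a bedtime story about a robot and a dragon.", "grounding_confidence": None},
]

dataset: List[Tuple[str, int]] = [(r["prompt"], to_label(r)) for r in responses]
print([label for _, label in dataset])  # [1, 0]
```

Note that this sketch keys only on whether a confidence is present, matching the two bullets above; it does not threshold the confidence value itself.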