Update README.md
README.md
CHANGED
@@ -91,6 +91,12 @@ The model was trained using **Knowledge Distillation**. The student (`TinyBERT`)
 | **Teacher** (XLM-R) | 278M | ~1 GB | ~360 samples/sec | 100% (Baseline) |
 | **Student** (TinyBERT) | **11M** | **42 MB** | **~3,300 samples/sec** | **96%** |
 
+## Evaluation Results
+
+The model shows strong separation between the routing categories, with minimal confusion between semantically distinct classes (e.g., Address vs. Regex).
+
+
+
 ## Usage
 
 This model outputs **Multi-Label Probabilities** for the 4 engines. We recommend a threshold of **0.5** to trigger a route.
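The thresholding step described under Usage can be sketched as follows. This is a minimal illustration, not the model's actual inference code: it assumes the four probabilities are already available as plain floats, and the engine names are hypothetical (only "Address" and "Regex" appear in the README; `engine_c` and `engine_d` are placeholders). Treating a probability exactly equal to the threshold as a trigger is also an assumption.

```python
from typing import List, Sequence

# Hypothetical label order. Only "address" and "regex" are named in the README;
# "engine_c" and "engine_d" are placeholders for the two unnamed engines.
ENGINES: List[str] = ["address", "regex", "engine_c", "engine_d"]


def select_routes(probs: Sequence[float], threshold: float = 0.5) -> List[str]:
    """Return every engine whose probability meets the threshold.

    The output is multi-label, so zero, one, or several engines may be selected.
    """
    if len(probs) != len(ENGINES):
        raise ValueError(f"expected {len(ENGINES)} probabilities, got {len(probs)}")
    # Assumption: a probability equal to the threshold counts as a trigger.
    return [name for name, p in zip(ENGINES, probs) if p >= threshold]


if __name__ == "__main__":
    # Made-up probabilities for a single input, for illustration only.
    print(select_routes([0.91, 0.12, 0.55, 0.03]))  # ['address', 'engine_c']
```

Because each label is thresholded independently, an input can trigger no route at all; callers may want a fallback for that case.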