Update README.md
README.md CHANGED

@@ -36,7 +36,6 @@ A lightweight, multilingual DistilBERT model fine-tuned for End-of-Utterance (EO
 - **F1 Score**: 0.9150
 - **Precision**: 0.9796
 - **Recall**: 0.8584
-- **Optimal Threshold**: 0.86
 
 ## Model Variants
 
@@ -112,7 +111,7 @@ text = "Thanks for your help!"
 inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=128)
 outputs = model(**inputs)
 probs = torch.softmax(outputs.logits, dim=-1)
-is_eou = probs[0][1] > 0.86
+is_eou = probs[0][1] > 0.5  # Using optimal threshold
 
 print(f"EOU Probability: {probs[0][1]:.3f}")
 print(f"Is EOU: {is_eou}")
@@ -147,7 +146,7 @@ logits = outputs[0][0]
 
 # Calculate probability
 probs = np.exp(logits) / np.sum(np.exp(logits))
-is_eou = probs[1] > 0.86
+is_eou = probs[1] > 0.5  # Using optimal threshold
 
 print(f"EOU Probability: {probs[1]:.3f}")
 print(f"Is EOU: {is_eou}")
@@ -169,10 +168,10 @@ This model is designed for:
 
 The model was trained using knowledge distillation on a multilingual dataset:
 
-- **English**:
-- **Hindi**:
-- **Spanish**:
-- **Total**: ~
+- **English**: 76,258 samples
+- **Hindi**: 75,103 samples
+- **Spanish**: 75,963 samples
+- **Total**: ~211K samples
 
 ### Training Configuration
 
@@ -202,9 +201,8 @@ The model was evaluated on:
 ### Inference Speed
 
 Approximate inference times (CPU, single sample):
-
-- ONNX
-- ONNX Quantized INT8: ~5-8ms
+- ONNX Optimized: ~70-120ms
+- ONNX Quantized INT8: ~40-50ms
 
 *Note: Actual speeds vary by hardware*
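As a sanity check on the metrics block touched by the first hunk, the reported F1 of 0.9150 is consistent with the reported precision and recall via the standard harmonic-mean formula:

```python
# Verify the README's F1 score follows from its precision and recall:
# F1 = 2 * P * R / (P + R)
precision = 0.9796
recall = 0.8584
f1 = 2 * precision * recall / (precision + recall)
print(f"F1: {f1:.4f}")  # matches the reported 0.9150
```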
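The second hunk lowers the PyTorch decision threshold from the previously documented optimal value of 0.86 to 0.5. That thresholding step can be exercised in isolation without loading the model; the logits below are made-up values for illustration only, not real model output:

```python
import torch

# Dummy logits for illustration (not real model output): batch of 1, 2 classes,
# where index 1 is the EOU class.
logits = torch.tensor([[0.3, 1.2]])
probs = torch.softmax(logits, dim=-1)

# The diff changes the decision threshold from 0.86 to 0.5.
is_eou = (probs[0][1] > 0.5).item()
print(f"EOU Probability: {probs[0][1]:.3f}")
print(f"Is EOU: {is_eou}")
```

Note the practical effect of the change: an utterance scoring, say, 0.71 is now classified as end-of-utterance, whereas under the old 0.86 threshold it was not — the update trades some precision for recall at inference time.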
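The ONNX snippet in the third hunk computes softmax as `np.exp(logits) / np.sum(np.exp(logits))`, which can overflow for large logits. A numerically stable variant subtracts the maximum logit first; this is a standard trick and a suggestion here, not something the diff itself changes:

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    # Subtracting the max keeps exp() from overflowing;
    # the result is mathematically identical.
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

# Dummy logits for illustration (not real model output).
logits = np.array([2.0, 3.0])
probs = softmax(logits)
is_eou = probs[1] > 0.5  # threshold from the updated README
print(f"EOU Probability: {probs[1]:.3f}")
print(f"Is EOU: {is_eou}")
```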