Upload README.md with huggingface_hub
README.md
CHANGED
@@ -69,6 +69,11 @@ print(generated_text)
- `"kl"`: KL divergence (D_KL(candidate || target)) - measures how much information is lost when using the candidate distribution to approximate the target
- `"js"`: Jensen-Shannon divergence - a symmetric and bounded measure of distribution similarity
- `"draft_tokens"`: Absolute difference between draft and target model probability of drafted token
+ - **`track_acceptance_metrics`** (bool, default: False): Whether to track and return draft token acceptance statistics. When enabled, the output includes:
+ - `draft_token_acceptance_rate`: Ratio of accepted draft tokens to total draft tokens
+ - `total_draft_tokens`: Total number of draft tokens generated
+ - `total_accepted_tokens`: Total number of draft tokens accepted
+ Set to `True` when you need to analyze acceptance rates for performance evaluation. When `False` (default), no tracking computation is performed, minimizing overhead for production use.

### How It Works
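To make the three statistics added under `track_acceptance_metrics` concrete, here is a minimal sketch of how they relate to each other. It does not call the library: the helper name `summarize_acceptance` and the per-step `(drafted, accepted)` input format are assumptions made for illustration. Only the three output keys come from the README text.

```python
from typing import Dict, Iterable, Tuple, Union


def summarize_acceptance(
    steps: Iterable[Tuple[int, int]],
) -> Dict[str, Union[int, float]]:
    """Aggregate per-step (drafted, accepted) counts into the three metrics.

    Hypothetical helper: the real implementation may collect these counts
    differently. Only the output key names are taken from the README.
    """
    total_draft = 0
    total_accepted = 0
    for drafted, accepted in steps:
        # A step cannot accept more tokens than the draft model proposed.
        if not 0 <= accepted <= drafted:
            raise ValueError("accepted must be between 0 and drafted")
        total_draft += drafted
        total_accepted += accepted

    # Avoid division by zero when no draft tokens were produced.
    rate = total_accepted / total_draft if total_draft else 0.0
    return {
        "draft_token_acceptance_rate": rate,
        "total_draft_tokens": total_draft,
        "total_accepted_tokens": total_accepted,
    }


# Three speculative steps: 3 of 5, 5 of 5, and 0 of 4 draft tokens accepted.
metrics = summarize_acceptance([(5, 3), (5, 5), (4, 0)])
print(metrics)
```

A low rate in a summary like this suggests the draft model often disagrees with the target model, so fewer drafted tokens are kept per verification step.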