Add model card with training config and eval metrics
README.md
CHANGED
@@ -18,6 +18,8 @@ Part of a course project evaluating per-step weighted loss functions for trainin
 EAGLE3 draft models. Full pipeline and source:
 **https://github.com/XLOverflow/anlp_course_project**
 
+Collection: [Qwen3 EAGLE3 — Weighted Loss Variants](https://huggingface.co/collections/XLOverflow/qwen3-eagle3-weighted-loss-variants)
+
 ## Training
 
 - **Framework:** [SpecForge](https://github.com/sgl-project/SpecForge) (our fork: https://github.com/XLOverflow/SpecForge)
@@ -41,10 +43,9 @@ Baselines for reference: Vanilla ≈ 1× speedup, EAGLE-orig ≈ 2× speedup.
 
 - `model.safetensors` — draft model weights (~763 MB)
 - `config.json` — model config
--
+- Corresponds to: `outputs/eagle3-adaspec/epoch_0_step_17026` in the original training output
 
-Optimizer state (
-repo's training scripts to resume from scratch if needed.
+Optimizer state (~3 GB) is not uploaded — use the project repo's training scripts to resume from scratch if needed.
 
 ## Usage
 
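The two files the README lists, `model.safetensors` and `config.json`, can be sanity-checked after download without loading any weights. The sketch below is not part of the project's code. It assumes both files sit in one local directory and relies only on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header). The helper names `read_safetensors_header` and `summarize_checkpoint` are hypothetical.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize_checkpoint(model_dir):
    """Summarize config.json and model.safetensors found in model_dir (hypothetical helper)."""
    model_dir = Path(model_dir)
    config = json.loads((model_dir / "config.json").read_text(encoding="utf-8"))
    header = read_safetensors_header(model_dir / "model.safetensors")

    # "__metadata__" is an optional free-form entry, not a tensor.
    tensors = {name: info for name, info in header.items() if name != "__metadata__"}

    # Each tensor entry records its [begin, end) byte range in the data section.
    tensor_bytes = sum(
        info["data_offsets"][1] - info["data_offsets"][0] for info in tensors.values()
    )
    return {
        "config_keys": sorted(config),
        "num_tensors": len(tensors),
        "tensor_bytes": tensor_bytes,
    }
```

Calling `summarize_checkpoint("path/to/downloaded/files")` on a local copy should report a `tensor_bytes` total close to the file size stated in the README (about 763 MB); a large mismatch would suggest an incomplete download.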