Upload 2 files
- .gitattributes +1 -0
- README.md +11 -0
- model_comparison_aaai.png +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+model_comparison_aaai.png filter=lfs diff=lfs merge=lfs -text
README.md
CHANGED
@@ -35,6 +35,17 @@ A 30B-parameter instruction-tuned language model optimized for reasoning, math,
 | **Vocab size** | 64,000 |
 | **Chat template** | ChatML (`<\|im_start\|>` / `<\|im_end\|>`) |
 
+## Benchmark Results (5-shot, acc_norm)
+
+| Benchmark | Kai-30B-Instruct | Llama-3 70B | Qwen2.5 32B | Yi-34B | Llama-3 8B | Mistral 7B | Llama-2 7B |
+|-----------|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
+| **ARC-C** | 64.0 | 83.0 | 70.5 | 65.3 | 60.1 | 55.5 | 53.0 |
+| **HellaSwag** | 74.4 | 89.0 | 85.2 | 83.1 | 78.6 | 81.3 | 78.6 |
+| **PIQA** | 84.8 | 85.0 | 84.1 | 82.5 | 79.8 | 82.1 | 78.1 |
+| **Winogrande** | **86.4** | 83.0 | 78.2 | 76.4 | 73.0 | 74.0 | 69.1 |
+
+![Model Comparison](model_comparison_aaai.png)
+
 ## What is ADS?
 
 **Adaptive Dual-Search Distillation** treats model fine-tuning as a constrained optimization problem inspired by Operations Research. The core mechanism is a dynamic loss function with a stateful dual penalty factor that adapts based on embedding space entropy — forcing the model to converge to high-confidence predictions at difficult reasoning points, without modifying the model architecture.
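Reviewer note: the README context above describes ADS only in prose and gives no formulas. The following is a minimal sketch of one generic way a "stateful dual penalty factor" could work, using textbook projected dual ascent on an entropy constraint. It is not the authors' implementation. The names `DualPenalty`, `target_entropy`, and `step_size` are hypothetical, and the entropy of the predicted distribution is used as a stand-in for the "embedding space entropy" mentioned in the README.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def cross_entropy(probs, target_index):
    """Negative log-likelihood of the target class."""
    return -math.log(max(probs[target_index], 1e-12))


class DualPenalty:
    """Hypothetical stateful multiplier updated by projected dual ascent.

    Assumed constraint: entropy(probs) <= target_entropy.
    """

    def __init__(self, target_entropy, step_size=0.1):
        self.target_entropy = target_entropy
        self.step_size = step_size
        self.value = 0.0  # the dual variable; persists across training steps

    def loss(self, probs, target_index):
        # Task loss plus the multiplier times the (non-negative) violation.
        violation = entropy(probs) - self.target_entropy
        return cross_entropy(probs, target_index) + self.value * max(violation, 0.0)

    def update(self, probs):
        # Dual ascent step, projected so the multiplier never goes negative.
        violation = entropy(probs) - self.target_entropy
        self.value = max(0.0, self.value + self.step_size * violation)
        return self.value


penalty = DualPenalty(target_entropy=0.5)
uncertain = [0.25, 0.25, 0.25, 0.25]  # high-entropy prediction
confident = [0.97, 0.01, 0.01, 0.01]  # low-entropy prediction

for _ in range(3):
    penalty.update(uncertain)  # multiplier grows while the constraint is violated
grown = penalty.value
penalty.update(confident)      # multiplier relaxes once the constraint is met
relaxed = penalty.value
```

Under these assumptions, the multiplier rises on uncertain predictions and falls on confident ones, so the penalty concentrates on inputs where the model remains uncertain and leaves the architecture untouched.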
model_comparison_aaai.png
ADDED
Git LFS Details