## 1. Introduction
**Alpie Core is one of the first fine-tuned 4-bit reasoning models from India, and among the first worldwide at this scale.** Trained on just 8 Hopper GPUs using LoRA for parameter-efficient fine-tuning, QLoRA 4-bit quantization, and synthetic STEM-rich dataset distillation, it demonstrates that an aggressively quantized model can not only match but surpass full-precision baselines.
With a dramatically reduced memory footprint, Alpie Core delivers frontier-level reasoning performance, even outperforming some leading proprietary models. It achieves **81.28% on MMLU, 92.75% on GSM8K, and 57.8% on SWE-Bench Verified**, ranking among the top models on competitive leaderboards worldwide and demonstrating that efficient models can rival frontier systems while remaining practical for real-world deployment at scale.
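A quick back-of-the-envelope calculation (a sketch: the real footprint also includes quantization scales, the KV cache, and activations) shows why 4-bit weights change the deployment picture for a 32B-parameter model:

```python
def weight_memory_gib(n_params: float, bits_per_weight: int) -> float:
    """Approximate weight-only memory footprint in GiB."""
    return n_params * bits_per_weight / 8 / 2**30

N = 32e9  # 32B parameters

fp16_gib = weight_memory_gib(N, 16)  # ~59.6 GiB
int4_gib = weight_memory_gib(N, 4)   # ~14.9 GiB

print(f"FP16: {fp16_gib:.1f} GiB, 4-bit: {int4_gib:.1f} GiB")
```

At 4 bits the weights fit comfortably on a single data-center GPU, which is what makes single-GPU serving of a 32B model practical.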
## 2. Model Summary
## 6. Benchmark Results
| Benchmark | Alpie Core (32B-4bit) | DeepSeek-V2 (236B) | Qwen2.5 72B | Llama 3.1 405B | Llama 3.1 70B | Gemma-3 27B-PT | Mistral-Small-24B-Base-2501 |
|-----------|----------------------|-------------------|-------------|---------------|---------------|----------------|----------------------------|
| 6 | DeepSeek R1 | 49.2 | Below Alpie |
| 7 | Devstral | 46.8 | Below Alpie |
### Humanity's Last Exam Leaderboard Performance