For quantized baselines, we use **unsloth/DeepSeek-R1-Distill-Qwen-1.5B-Q8\_0**.

<hr style="margin: 4px 0px;">

## Hardware Used
The quantization-aware training (QAT) of the Dheyo models and the benchmarking of all model variants were conducted on a cluster equipped with **AMD Instinct MI300X GPUs**. Each GPU provides 192 GB of HBM3 memory and high memory bandwidth, making them well-suited for running large quantized models efficiently. All evaluations were performed using the **llama.cpp** backend, optimized for low-precision inference.
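As a rough illustration of the evaluation setup described above, the sketch below shows how a Q8\_0 GGUF model might be loaded with llama.cpp and fully offloaded to a single GPU. The model filename and prompt are placeholders, and a ROCm/HIP build of llama.cpp is assumed; the actual benchmark harness and flags used by the authors are not specified in this README.

```shell
# Hypothetical invocation sketch (not the authors' exact command):
# run low-precision inference on a Q8_0 quantized model with llama.cpp,
# offloading all layers to the GPU via -ngl.
llama-cli \
  -m DeepSeek-R1-Distill-Qwen-1.5B-Q8_0.gguf \
  -ngl 99 \
  -n 512 \
  -p "Solve step by step: what is the derivative of x^3?"
```

In practice a benchmark run would wrap such an invocation in a harness that feeds each Math500 or GPQA Diamond question as the prompt and scores the single sampled completion (Pass@1).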

<hr style="margin: 4px 0px;">

## Benchmark Results
### Math500 Pass@1 and GPQA Diamond Pass@1 Benchmarks