Upload folder using huggingface_hub
- .gitattributes +1 -0
- README.md +8 -6
- SAND-MATH-Blog.png +3 -0
.gitattributes
CHANGED
@@ -4,4 +4,5 @@
 *.pth filter=lfs diff=lfs merge=lfs -text
 
 PipelineSimple.png filter=lfs diff=lfs merge=lfs -text
+SAND-MATH-Blog.png filter=lfs diff=lfs merge=lfs -text
 tokenizer.json filter=lfs diff=lfs merge=lfs -text
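The added `.gitattributes` line routes `SAND-MATH-Blog.png` through Git LFS. As a rough way to check which files the attributes above would send to LFS, here is a minimal sketch. The helper names `lfs_patterns` and `is_lfs_tracked` are hypothetical, and `fnmatchcase` is only an approximation of Git's real pattern matching (it is adequate for the simple patterns in this hunk, not for directory or `**` patterns).

```python
from fnmatch import fnmatchcase

# Attribute lines as they appear on the new side of the hunk above.
GITATTRIBUTES = """\
*.pth filter=lfs diff=lfs merge=lfs -text
PipelineSimple.png filter=lfs diff=lfs merge=lfs -text
SAND-MATH-Blog.png filter=lfs diff=lfs merge=lfs -text
tokenizer.json filter=lfs diff=lfs merge=lfs -text
"""


def lfs_patterns(text: str) -> list[str]:
    """Collect the path patterns whose attributes include filter=lfs."""
    patterns = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comments.
        if not stripped or stripped.startswith("#"):
            continue
        parts = stripped.split()
        if "filter=lfs" in parts[1:]:
            patterns.append(parts[0])
    return patterns


def is_lfs_tracked(filename: str, text: str = GITATTRIBUTES) -> bool:
    """Approximate check: does any LFS pattern match this file name?"""
    return any(fnmatchcase(filename, p) for p in lfs_patterns(text))


for name in ("SAND-MATH-Blog.png", "model.pth", "README.md"):
    print(name, is_lfs_tracked(name))
```

For an authoritative answer inside a real checkout, `git check-attr filter -- <path>` reports the attribute Git actually resolves.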
README.md
CHANGED
@@ -12,7 +12,9 @@ base_model:
 - deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
 ---
 
-#
+# State-of-the-art Large Reasoning Model Built Using Only Synthetic Data on AMD GPUs
+
+<div align="center">
 
 | [](https://arxiv.org/pdf/2507.20527) | [](https://huggingface.co/datasets/amd/SAND-Post-Training-Dataset) | [](https://github.com/AMD-AGI/sand-pipeline) | [](https://rocm.blogs.amd.com/artificial-intelligence/sand-math/README.html) |
 | :---: | :---: | :---: | :---: |
@@ -20,7 +22,7 @@ base_model:
 
 ## Model Summary
 
-We introduce **SAND-Math-Qwen2.5-32B** and **SAND-MathScience-DeepSeek-Qwen32B**, reasoning models built entirely using a synthetic data pipeline running on the **AMD ROCm™ stack** and **AMD Instinct™ MI325 GPUs**.
+We introduce **SAND-Math-Qwen2.5-32B** and **SAND-MathScience-DeepSeek-Qwen32B**, state-of-the-art reasoning models in the 32B parameter range, built entirely using a synthetic data pipeline running on the **AMD ROCm™ stack** and **AMD Instinct™ MI325 GPUs**.
 
 By prioritizing data difficulty along with quantity, we demonstrate that high-difficulty synthetic data can elevate prior-generation models to match or exceed modern proprietary models. `SAND-Math-Qwen2.5-32B` is fine-tuned from **Qwen2.5-32B-Instruct** on just **14k synthetic math samples**, achieving strong reasoning capabilities with minimal data outperforming other data distillation and post training approaches. `SAND-MathScience-DeepSeek-Qwen32B` is fine-tuned from **DeepSeek-R1-Distill-Qwen-32B** on a compact dataset of **27k samples** (15k Math + 12k Science), achieving a generational leap in performance that rivals **Qwen3-32B**.
 
@@ -46,10 +48,10 @@ Using only **14k synthetic math samples** and standard SFT (no RL), our approach
 | Model | Data Size | AIME24 | AIME25 | MATH500 | GPQA |
 | :--- | :--- | :---: | :---: | :---: | :---: |
 | Qwen2.5-32B-Instruct (Base) | - | 16.7 | 13.3 | 83.4 | 53.5 |
-| DeepSeek-R1-Distill-Qwen-32B | 800k | 72.6 | 54.9 | 94.3 | 62.1 |
+| DeepSeek-R1-Distill-Qwen-32B | 800k | 72.6 | 54.9 | **94.3** | **62.1** |
 | Light-R1-32B | 79k | 73.0 | 64.3 | 93.3 | 60.6 |
 | OpenThinker-32B | 114k | 66.0 | 53.3 | 89.4 | 57.6 |
-| **SAND-Math-Qwen2.5-32B (Ours)** | **14k** | **74.01** | **68.18** |
+| **SAND-Math-Qwen2.5-32B (Ours)** | **14k** | **74.01** | **68.18** | 92.05 | 60.8 |
 
 ---
 
@@ -57,7 +59,7 @@ Using only **14k synthetic math samples** and standard SFT (no RL), our approach
 
 Our results are powered by a 4-stage automated pipeline running on AMD hardware that prioritizes **difficulty and novelty** over volume. Unlike datasets that recycle easy problems, our pipeline leverages a Teacher Model (`GPT-OSS120b`) to generate, validate, and systematically "hike" the difficulty of reasoning problems.
 
-
+
 
 ### Pipeline Stages
 
@@ -95,7 +97,7 @@
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 
 # Example prompt
-prompt = "
+prompt = "A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost?"
 messages = [
 {"role": "user", "content": prompt}
 ]
SAND-MATH-Blog.png
ADDED (Git LFS)
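The new example prompt in README.md has a well-known trap answer of $0.10. For anyone sanity-checking model output against it, the reference answer can be worked out directly. This is a minimal sketch; the function name `solve_bat_and_ball` is hypothetical and not part of the README.

```python
from decimal import Decimal


def solve_bat_and_ball(total: Decimal, difference: Decimal) -> tuple[Decimal, Decimal]:
    """Solve ball + bat = total and bat - ball = difference.

    Substituting bat = ball + difference gives 2 * ball + difference = total.
    """
    ball = (total - difference) / 2
    bat = ball + difference
    return ball, bat


# Values taken from the prompt: $1.10 in total, bat costs $1.00 more.
ball, bat = solve_bat_and_ball(Decimal("1.10"), Decimal("1.00"))
print(f"ball = ${ball}, bat = ${bat}")  # ball = $0.05, bat = $1.05
```

`Decimal` is used instead of `float` so the dollar amounts stay exact.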