Update README.md
README.md CHANGED
@@ -181,7 +181,8 @@ extra_gated_button_content: Submit
 
 # GusLovesMath/LlaMATH-3-8B-Instruct-4bit
 This model was converted to MLX format from [`mlx-community/Meta-Llama-3-8B-Instruct-4bit`]() using mlx-lm version **0.12.1**.
-Refer to the [original model card](https://huggingface.co/mlx-community/Meta-Llama-3-8B-Instruct-4bit) for more details on the model.
+Refer to the [original model card](https://huggingface.co/mlx-community/Meta-Llama-3-8B-Instruct-4bit) for more details on the model.
+**Note:** This model was trained locally on an M2 Pro chip with 16 GB of RAM and 16 GPU cores.
 ## Use with mlx
 
 ```bash
@@ -194,3 +195,40 @@ from mlx_lm import load, generate
 model, tokenizer = load("GusLovesMath/LlaMATH-3-8B-Instruct-4bit")
 response = generate(model, tokenizer, prompt="hello", verbose=True)
 ```
+
+Try the following prompt.
+```python
+# Our Prompt
+prompt = """
+Q A new program had 60 downloads in the first month.
+The number of downloads in the second month was three
+times as many as the downloads in the first month,
+but then reduced by 30% in the third month. How many
+downloads did the program have total over the three months?
+"""
+print(f"Our Test Prompt")
+print(f"Q {prompt}")
+
+# Testing model with prompt
+response = generate(
+    model,
+    tokenizer,
+    prompt=prompt,
+    max_tokens=132,
+    temp=0.0,
+    verbose=False
+)
+
+# Printing the model's response
+print(f'LlaMATH Response')
+print(response)
+```
+
+```bash
+A: The number of downloads in the first month was 60.
+The number of downloads in the second month was three times as many as the downloads in the first month, so it was 60 * 3 = <<60*3=180>>180.
+The number of downloads in the third month was 30% less than the number of downloads in the second month, so it was 180 * 0.7 = <<180*0.7=126>>126.
+The total number of downloads over the three months was 60 + 180 + 126 = <<60+180+126=366>>366.
+#### 366
+```
+
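The sample response in this diff looks like it follows the GSM8K answer convention: each intermediate calculation is wrapped as `<<expression=result>>` and the final answer follows `####`. Assuming that reading is right, the sketch below re-checks every annotated step and recomputes the total independently. It does not call the model; the response text is copied from the diff, and the helper names (`eval_arithmetic`, `check_response`, `recompute_total`) are hypothetical and not part of mlx-lm.

```python
import ast
import operator
import re
from fractions import Fraction

# Response text copied from the sample output added in this commit.
RESPONSE = """\
A: The number of downloads in the first month was 60.
The number of downloads in the second month was three times as many as the downloads in the first month, so it was 60 * 3 = <<60*3=180>>180.
The number of downloads in the third month was 30% less than the number of downloads in the second month, so it was 180 * 0.7 = <<180*0.7=126>>126.
The total number of downloads over the three months was 60 + 180 + 126 = <<60+180+126=366>>366.
#### 366
"""

# Assumed markers: <<expr=result>> for steps, "#### answer" for the final line.
ANNOTATION = re.compile(r"<<([^=<>]+)=([^<>]+)>>")
FINAL = re.compile(r"^####\s*(-?[\d.,]+)\s*$", re.MULTILINE)

_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def eval_arithmetic(expr: str) -> Fraction:
    """Evaluate +, -, *, / on numeric literals exactly, without eval()."""

    def walk(node: ast.AST) -> Fraction:
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            # str() gives the literal's shortest form, so 0.7 becomes exactly 7/10.
            return Fraction(str(node.value))
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError(f"unsupported expression: {expr!r}")

    return walk(ast.parse(expr.strip(), mode="eval").body)


def check_response(text: str):
    """Return ([(expression, step_is_correct), ...], final_answer_or_None)."""
    steps = [
        (expr, eval_arithmetic(expr) == Fraction(claimed.replace(",", "").strip()))
        for expr, claimed in ANNOTATION.findall(text)
    ]
    match = FINAL.search(text)
    final = Fraction(match.group(1).replace(",", "")) if match else None
    return steps, final


def recompute_total() -> Fraction:
    """Solve the word problem directly from its statement."""
    first = Fraction(60)
    second = 3 * first                    # three times the first month
    third = second * (1 - Fraction(30, 100))  # reduced by 30%
    return first + second + third


if __name__ == "__main__":
    steps, final = check_response(RESPONSE)
    for expr, ok in steps:
        print(f"{expr}: {'ok' if ok else 'MISMATCH'}")
    print("final answer matches recomputed total:", final == recompute_total())
```

Exact `Fraction` arithmetic is used because `180 * 0.7` is not exactly `126` in binary floating point, so a plain float comparison would need a tolerance.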