---
license: apache-2.0
language:
- en
base_model: allenai/OLMo-3-7B-RLZero-Math
tags:
- gguf
- mlx
- ollama
- math
- reasoning
- olmo
model-index:
- name: OLMo-3-7B-RLZero-Math-GGUF
  results: []
---
# OLMo-3-7B-RLZero-Math - GGUF, MLX & Ollama

Community quantizations of [allenai/OLMo-3-7B-RLZero-Math](https://huggingface.co/allenai/OLMo-3-7B-RLZero-Math) for efficient local inference.

## Model Description

OLMo-3-7B-RLZero-Math is a 7B-parameter model fine-tuned for mathematical reasoning with reinforcement learning (the RL-Zero approach). This repository provides quantized versions for a range of deployment scenarios.

**Key Features:**
- 65,536-token context length (with YaRN scaling)
- Specialized for step-by-step mathematical problem solving
- Apache 2.0 license

## Available Formats

### GGUF Quantizations (llama.cpp compatible)

| Filename | Quant Type | Size | Description |
|----------|-----------|------|-------------|
| `Olmo-3-7B-RLZero-Math.gguf` | F16 | 14 GB | Full-precision source |
| `Olmo-3-7B-RLZero-Math-Q8_0.gguf` | Q8_0 | 7.2 GB | High quality, 8-bit |
| `Olmo-3-7B-RLZero-Math-Q5_K_M.gguf` | Q5_K_M | 4.9 GB | Good quality/size balance |
| `Olmo-3-7B-RLZero-Math-Q4_K_M.gguf` | Q4_K_M | 4.2 GB | Recommended for most users |
| `Olmo-3-7B-RLZero-Math-IQ4_XS.gguf` | IQ4_XS | 3.8 GB | Importance-weighted, 4.25 bpw |
| `Olmo-3-7B-RLZero-Math-IQ3_M.gguf` | IQ3_M | 3.2 GB | Importance-weighted, 3.66 bpw |
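As a rough rule of thumb (an estimate, not an official sizing guide), a GGUF file's size is about parameter count times average bits per weight, divided by 8; actual RAM use is somewhat higher once the KV cache and runtime overhead are added. A minimal sketch, assuming ~4.8 bpw as a typical average for Q4_K_M:

```python
def gguf_size_gb(params_b: float, bits_per_weight: float) -> float:
    """Approximate GGUF file size in GB: billions of params * bpw / 8 bits per byte."""
    return params_b * bits_per_weight / 8

# Assumed average of ~4.8 bpw for Q4_K_M on a 7B model.
approx = gguf_size_gb(7.0, 4.8)
print(f"~{approx:.1f} GB")
```

This lands close to the 4.2 GB listed in the table above; budget extra headroom for context.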
### MLX Format (Apple Silicon)

The `mlx/` folder contains a 4-bit quantized version optimized for Apple Silicon Macs.

### Ollama

```bash
ollama run richardyoung/olmo-3-7b-rlzero-math
```

## Usage

### llama.cpp

```bash
./llama-cli -m Olmo-3-7B-RLZero-Math-Q4_K_M.gguf -p "Solve: What is 15% of 240?" -n 512
```

### MLX (Apple Silicon)

```bash
pip install mlx-lm
mlx_lm.generate --model mlx/ --prompt "Solve step by step: If a train travels 120 miles in 2 hours, what is its average speed?"
```

### Python with llama-cpp-python

```python
from llama_cpp import Llama

llm = Llama(model_path="Olmo-3-7B-RLZero-Math-Q4_K_M.gguf", n_ctx=4096)
output = llm("Solve: What is the derivative of x^2 + 3x?", max_tokens=256)
print(output["choices"][0]["text"])
```

## Prompt Format

The model uses a simple prompt format for math problems:

```
Solve the following math problem step by step:
{problem}
```
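If you build prompts programmatically, a small helper (`build_prompt` is a hypothetical name, not part of any library) keeps the template consistent:

```python
def build_prompt(problem: str) -> str:
    """Wrap a math problem in the prompt template the model expects."""
    return f"Solve the following math problem step by step:\n{problem}"

# Example: pass the result as the prompt string to llama.cpp, MLX, or Ollama.
print(build_prompt("What is 15% of 240?"))
```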
## Credits

- Original model: [Allen Institute for AI](https://allenai.org/)
- Quantization: richardyoung

## License

Apache 2.0 (same as the original model)