richardyoung/OLMo-3-7B-RLZero-Math-GGUF


richardyoung committed on Nov 26, 2025

Commit 5238d6e · verified · 1 Parent(s): aa0cbd9

Upload README.md with huggingface_hub

Files changed (1)

README.md +94 -0

README.md ADDED

	@@ -0,0 +1,94 @@

+---
+license: apache-2.0
+language:
+- en
+base_model: allenai/OLMo-3-7B-RLZero-Math
+tags:
+- gguf
+- mlx
+- ollama
+- math
+- reasoning
+- olmo
+model-index:
+- name: OLMo-3-7B-RLZero-Math-GGUF
+  results: []
+---
+# OLMo-3-7B-RLZero-Math - GGUF, MLX & Ollama
+Community quantizations of [allenai/OLMo-3-7B-RLZero-Math](https://huggingface.co/allenai/OLMo-3-7B-RLZero-Math) for efficient local inference.
+## Model Description
+OLMo-3-7B-RLZero-Math is a 7B-parameter model fine-tuned for mathematical reasoning using reinforcement learning (the RL-Zero approach). This repository provides quantized versions for local inference with llama.cpp, MLX, and Ollama.
+**Key Features:**
+- 65,536 token context length (with YaRN scaling)
+- Specialized for step-by-step mathematical problem solving
+- Apache 2.0 license
+## Available Formats
+### GGUF Quantizations (llama.cpp compatible)
+| Filename | Quant Type | Size | Description |
+|----------|-----------|------|-------------|
+| `Olmo-3-7B-RLZero-Math.gguf` | F16 | 14 GB | Unquantized 16-bit source |
+| `Olmo-3-7B-RLZero-Math-Q8_0.gguf` | Q8_0 | 7.2 GB | High quality, 8-bit |
+| `Olmo-3-7B-RLZero-Math-Q5_K_M.gguf` | Q5_K_M | 4.9 GB | Balance of quality and size, 5-bit |
+| `Olmo-3-7B-RLZero-Math-Q4_K_M.gguf` | Q4_K_M | 4.2 GB | Recommended for most users |
+| `Olmo-3-7B-RLZero-Math-IQ4_XS.gguf` | IQ4_XS | 3.8 GB | I-quant, 4.25 bits per weight (bpw) |
+| `Olmo-3-7B-RLZero-Math-IQ3_M.gguf` | IQ3_M | 3.2 GB | I-quant, 3.66 bits per weight (bpw) |
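As a rough sanity check on the table above, the sketch below estimates the effective bits per weight of each file by comparing its size to the F16 file, assuming the F16 file stores about 16 bits per weight. The `largest_fit` helper and its `overhead_gb` allowance for KV cache and runtime buffers are illustrative assumptions, not part of this repository. Sizes are the rounded values from the table, so the results are approximate.

```python
from typing import Optional

# File sizes in GB, copied from the table above (rounded values).
SIZES_GB = {
    "F16": 14.0,
    "Q8_0": 7.2,
    "Q5_K_M": 4.9,
    "Q4_K_M": 4.2,
    "IQ4_XS": 3.8,
    "IQ3_M": 3.2,
}


def effective_bpw(quant: str) -> float:
    """Approximate bits per weight, assuming the F16 file uses 16 bits per weight."""
    return 16.0 * SIZES_GB[quant] / SIZES_GB["F16"]


def largest_fit(budget_gb: float, overhead_gb: float = 1.0) -> Optional[str]:
    """Return the largest quantization whose file plus overhead fits the budget.

    overhead_gb is a hypothetical allowance for KV cache and runtime buffers.
    """
    fitting = {q: s for q, s in SIZES_GB.items() if s + overhead_gb <= budget_gb}
    return max(fitting, key=fitting.get) if fitting else None


if __name__ == "__main__":
    for name in SIZES_GB:
        print(f"{name}: ~{effective_bpw(name):.2f} bits per weight")
    print("Largest file fitting in 8 GB:", largest_fit(8.0))
```

The estimates land close to the nominal figures in the table (for example, roughly 3.66 for IQ3_M), which suggests the listed sizes are internally consistent. Actual memory use also depends on the context length chosen at load time.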
+### MLX Format (Apple Silicon)
+The `mlx/` folder contains a 4-bit quantized version optimized for Apple Silicon Macs.
+### Ollama
+```bash
+ollama run richardyoung/olmo-3-7b-rlzero-math
+```
+## Usage
+### llama.cpp
+```bash
+./llama-cli -m Olmo-3-7B-RLZero-Math-Q4_K_M.gguf -p "Solve: What is 15% of 240?" -n 512
+```
+### MLX (Apple Silicon)
+```bash
+pip install mlx-lm
+mlx_lm.generate --model mlx/ --prompt "Solve step by step: If a train travels 120 miles in 2 hours, what is its average speed?"
+```
+### Python with llama-cpp-python
+```python
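+from llama_cpp import Llama
+
+# Load the quantized model; n_ctx sets the context window in tokens.
+llm = Llama(model_path="Olmo-3-7B-RLZero-Math-Q4_K_M.gguf", n_ctx=4096)
+# Run a plain text completion, capped at 256 generated tokens.
+output = llm("Solve: What is the derivative of x^2 + 3x?", max_tokens=256)
+print(output["choices"][0]["text"])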
+```
+## Prompt Format
+The model uses a simple prompt format for math problems:
+```
+Solve the following math problem step by step:
+{problem}
+```
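The template above can be filled programmatically. The following is a minimal sketch. The names `PROMPT_TEMPLATE` and `build_prompt` are hypothetical helpers introduced here for illustration and are not part of the model or any library.

```python
# Template text copied from the prompt format shown above.
PROMPT_TEMPLATE = "Solve the following math problem step by step:\n{problem}"


def build_prompt(problem: str) -> str:
    """Insert a problem statement into the template, trimming surrounding whitespace."""
    return PROMPT_TEMPLATE.format(problem=problem.strip())


if __name__ == "__main__":
    print(build_prompt("  What is 15% of 240?  "))
```

The resulting string can be passed as the prompt to any of the runtimes listed under Usage, for example as the `-p` argument to `llama-cli`.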
+## Credits
+- Original model: [Allen Institute for AI](https://allenai.org/)
+- Quantization: richardyoung
+## License
+Apache 2.0 (same as original model)