---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
tags:
- qwen3
- reasoning
- long-context
- enterprise
- research
- conversational
---

# DeepBrainz-R1-4B-16K

DeepBrainz-R1-4B-16K is a compact, long-context reasoning model in the **DeepBrainz-R series**, designed for structured problem-solving, analysis, and enterprise research workflows.

The model emphasizes **reasoning quality, instruction robustness, and stability over long contexts**, while remaining efficient to deploy on modern GPU inference runtimes.

---

## Model Highlights

- ~4B parameters
- 16K context length
- Optimized for reasoning-centric math and coding tasks
- Designed for modern GPU inference runtimes
- **Architecture:** Qwen3-compatible (DeepBrainz-R series, post-trained and optimized for reasoning-centric workloads)

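
As a rough guide to what "~4B parameters" means for deployment, the sketch below estimates the memory needed for the weights alone at a few common precisions. It assumes exactly 4e9 parameters for illustration and ignores the KV cache, activations, and runtime overhead, so real usage will be higher. The helper name `weight_memory_gib` is hypothetical and not part of any library.

```python
# Approximate bytes per parameter for common storage precisions.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the approximate size of the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


# Assumption: "~4B" is treated as exactly 4e9 parameters.
NUM_PARAMS = 4e9

for dtype in ("float32", "bfloat16", "int8"):
    print(f"{dtype}: {weight_memory_gib(NUM_PARAMS, dtype):.1f} GiB")
```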

---

## Intended Use

- Advanced reasoning systems
- Math and coding
- Research and evaluation
- Agentic workflows
- Inference-time scaling and test-time compute experiments

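
One simple form of test-time compute experiment is self-consistency: sample several completions for the same prompt and keep the most frequent final answer. The sketch below shows only the voting step and is not a method documented for this model. It assumes the final answers have already been extracted from each completion as strings, and the helper name `majority_vote` is hypothetical.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most common non-empty answer and its share of the votes."""
    # Normalize whitespace and drop completions that yielded no answer.
    cleaned = [a.strip() for a in answers if a and a.strip()]
    if not cleaned:
        return None, 0.0
    answer, votes = Counter(cleaned).most_common(1)[0]
    return answer, votes / len(cleaned)


# Hypothetical final answers extracted from five sampled completions.
samples = ["7", "7", "6", "7 ", ""]
print(majority_vote(samples))
```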

**Not intended** as a general-purpose chat replacement for large frontier models.

---

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DeepBrainz/DeepBrainz-R1-4B-16K"

# Load the tokenizer and model weights from the Hugging Face Hub.
tok = AutoTokenizer.from_pretrained(model_id)
mdl = AutoModelForCausalLM.from_pretrained(model_id)

# Tokenize a plain-text prompt into PyTorch tensors.
prompt = "Solve step by step: If x + 5 = 12, what is x?"
inputs = tok(prompt, return_tensors="pt")

# Sample up to 256 new tokens with nucleus sampling.
out = mdl.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)

# The decoded text contains the prompt followed by the completion.
print(tok.decode(out[0], skip_special_tokens=True))
```
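
When raising `max_new_tokens` for longer reasoning traces, the prompt and the generated tokens together still have to fit in the context window. The sketch below clamps a requested generation length to the remaining room. It assumes "16K" means 16,384 tokens, which this card does not state explicitly, and the helper name `generation_budget` is hypothetical.

```python
# Assumption: "16K" is interpreted as 16 * 1024 = 16,384 tokens.
CONTEXT_LEN = 16 * 1024


def generation_budget(prompt_tokens: int, requested_new_tokens: int,
                      context_len: int = CONTEXT_LEN) -> int:
    """Clamp the requested generation length to the room left in the window."""
    if prompt_tokens >= context_len:
        raise ValueError("prompt alone fills or exceeds the context window")
    return min(requested_new_tokens, context_len - prompt_tokens)


# With the snippet above, the prompt length is inputs["input_ids"].shape[1],
# and the result can be passed to generate() as max_new_tokens.
print(generation_budget(prompt_tokens=16000, requested_new_tokens=1024))
```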

---

## Training Summary

The model was produced using a **multi-stage optimization process** involving large-scale on-policy optimization and **iterative refinement** to improve reasoning quality and robustness.

Specific training details are intentionally abstracted in this public release.

---

## Limitations

Performance depends on task complexity and inference configuration.
Larger models may outperform R1-4B-16K on extremely complex tasks.

---

## License

Apache 2.0

---

## About DeepBrainz

DeepBrainz builds reasoning-first AI systems focused on efficiency, structure, and real-world problem-solving.