---
license: apache-2.0
base_model: Qwen/Qwen3-8B
datasets:
- batuhanozkose/Rehber-CoT-Science
tags:
- fine-tuned
- science
- turkish
- qwen3
- academic
- chain-of-thought
- reasoning
- stem
language:
- tr
- en
pipeline_tag: text-generation
model-index:
- name: Rehber-Science
results: []
---
# 🎓 Rehber-Science

**A Qwen3-8B model fine-tuned for Turkish scientific question answering and Chain-of-Thought reasoning**
*Step-by-step thinking, mathematical problem solving, and Python verification-code generation*
[Model](https://huggingface.co/batuhanozkose/Rehber-Science) • [Dataset](https://huggingface.co/datasets/batuhanozkose/Rehber-CoT-Science) • [Author](https://huggingface.co/batuhanozkose)
---
## 📌 Changelog
| Version | Date | Changes |
|---------|------|---------|
| **v2.0** | 24.12.2025 | ✨ Retrained on a new dataset (712 samples), `explained_answer` support, deeper CoT |
| **v1.0** | 15.12.2025 | 🚀 Initial release: full fine-tuning on 500+ samples |
---
## 🎯 Overview
**Rehber-Science** is a Turkish scientific reasoning model based on Qwen3-8B, fine-tuned on the [Rehber-CoT-Science](https://huggingface.co/datasets/batuhanozkose/Rehber-CoT-Science) dataset. The model excels at:
- 🧠 **Chain-of-Thought Reasoning**: Step-by-step problem solving
- 🔬 **Scientific Domains**: Physics, Chemistry, Biology, Mathematics, Statistics, Engineering
- 🐍 **Code Generation**: Executable Python verification code
- 📝 **Detailed Explanations**: Real-world intuitive explanations
- 🇹🇷 **Turkish Language**: Academic Turkish with proper scientific terminology
---
## 📊 Model Details
| Property | Value |
|----------|-------|
| **Base Model** | [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) |
| **Parameters** | 8B |
| **Training Type** | Full Fine-Tuning |
| **Dataset** | [batuhanozkose/Rehber-CoT-Science](https://huggingface.co/datasets/batuhanozkose/Rehber-CoT-Science) |
| **Dataset Size** | 712 samples |
| **Language** | Turkish 🇹🇷 |
| **Context Length** | 32K tokens |
| **Precision** | bfloat16 |
### Training Infrastructure
| Component | Details |
|-----------|---------|
| **GPU** | NVIDIA H100 |
| **Framework** | Transformers + DeepSpeed |
| **Epochs** | 3 |
| **Learning Rate** | 2e-5 |
📊 **Training Report:** [WandB Dashboard](https://api.wandb.ai/links/batuhanozkose-org/pwcmt0or)
---
## 🧪 Training Dataset
The model was trained on [Rehber-CoT-Science](https://huggingface.co/datasets/batuhanozkose/Rehber-CoT-Science):
| Feature | Value |
|---------|-------|
| **Samples** | 712 verified QA pairs |
| **Domains** | Physics, Chemistry, Biology, Math, Statistics, Engineering, Science |
| **Difficulty** | 75% PhD, 25% Undergraduate |
| **Format** | Chain-of-Thought + Python Code |
| **Verification** | 100% (3-stage hybrid verification) |
---
## 💻 Usage
### Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "batuhanozkose/Rehber-Science"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

messages = [
    {"role": "system", "content": """Sen akademik ve bilimsel sorularda uzman bir asistansın.
Her soruyu çözerken şu adımları takip et:
Düşünme süreci:
1. Soruyu analiz et - ne soruluyor?
2. İlgili kavramları ve formülleri belirle
3. Çözüm stratejisini planla
4. Olası hataları ve tuzakları düşün
5. Alternatif yaklaşımları değerlendir
Sonra adım adım çözümü yaz:
- Her adımı "Adım N:" formatında numaralandır
- Her adımda mantığını açıkla
- Formül kullanıyorsan göster
Son olarak:
- **Sonuç:** kısmında final cevabı ver
- **Doğrulama Kodu (Python):** kısmında kodu yaz
Türkçe cevap ver. Detaylı ve öğretici ol."""},
    {"role": "user", "content": "Suyun elektrolizinde hangi gazlar açığa çıkar ve mol oranları nedir?"}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
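Because the system prompt asks the model to end each answer with a **Doğrulama Kodu (Python):** section wrapped in a fenced code block, it can be convenient to pull that block out of the generated text for separate execution. The sketch below is a minimal, hypothetical post-processing helper (not part of this repository); the sample response string is an illustration of the expected format, not real model output.

```python
import re

# Build the fence marker programmatically so this snippet does not
# contain a literal triple-backtick fence of its own.
FENCE = "`" * 3

def extract_verification_code(response: str):
    """Return the contents of the first fenced Python block, or None."""
    match = re.search(FENCE + r"python\n(.*?)" + FENCE, response, re.DOTALL)
    return match.group(1).strip() if match else None

# Hypothetical model response, trimmed to its tail for illustration.
sample = (
    "**Sonuç:** 8.03×10⁸ rad/s\n"
    "**Doğrulama Kodu (Python):**\n"
    + FENCE + "python\n"
    "gamma = 2.675e8\n"
    "B = 3.0\n"
    "print(gamma * B)\n"
    + FENCE
)

code = extract_verification_code(sample)
print(code)
```

The extracted string can then be run in a sandbox (e.g. a subprocess with a timeout) to check the model's arithmetic automatically.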
### Loading Specific Version
```python
from transformers import AutoModelForCausalLM
# Latest version (v2)
model = AutoModelForCausalLM.from_pretrained("batuhanozkose/Rehber-Science")
# Previous version (v1)
model_v1 = AutoModelForCausalLM.from_pretrained("batuhanozkose/Rehber-Science", revision="v1")
```
### With vLLM (Recommended for Production)
```python
from vllm import LLM, SamplingParams

llm = LLM(model="batuhanozkose/Rehber-Science")
sampling_params = SamplingParams(temperature=0.7, max_tokens=1024)

prompts = ["Bir cismin serbest düşüş hızı nasıl hesaplanır?"]
outputs = llm.generate(prompts, sampling_params)
print(outputs[0].outputs[0].text)
```
---
## 📝 Example Output
**Input:**
> Bir protonun 3.0 T manyetik alandaki Larmor frekansını hesaplayınız.
**Output:**
````
Düşünme süreci:
1. Larmor frekansı formülü: ω = γB
2. Protonun jiromanyetik oranı: γ = 2.675×10⁸ rad/s/T
3. Manyetik alan: B = 3.0 T
**Adım 1:** Larmor frekansı formülünü yazalım:
ω = γ × B
**Adım 2:** Değerleri yerleştirelim:
ω = 2.675×10⁸ rad/s/T × 3.0 T
**Adım 3:** Hesaplayalım:
ω = 8.025×10⁸ rad/s
**Sonuç:** 8.03×10⁸ rad/s
**Doğrulama Kodu (Python):**
```python
gamma = 2.675e8  # rad/s/T
B = 3.0  # Tesla
omega = gamma * B
print(f"{omega:.3e} rad/s")  # Output: 8.025e+08 rad/s
```
````
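As a quick sanity check on the quoted example, the angular frequency can be converted to an ordinary frequency via f = ω / 2π, which recovers the familiar proton NMR frequency at 3 T:

```python
import math

gamma = 2.675e8   # proton gyromagnetic ratio, rad/s/T
B = 3.0           # magnetic field, T

omega = gamma * B          # angular frequency, rad/s
f = omega / (2 * math.pi)  # ordinary frequency, Hz

print(f"omega = {omega:.3e} rad/s")
print(f"f     = {f / 1e6:.1f} MHz")  # ≈ 127.7 MHz at 3 T
```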
---
## 🔬 Intended Use
### Primary Use Cases
- 📚 **Educational**: Helping students understand scientific concepts
- 🔬 **Research Assistance**: Quick calculations and formula verification
- 💻 **Code Generation**: Python code for scientific computations
- 📖 **Turkish STEM Content**: Academic Turkish scientific writing
### Limitations
- Not intended for critical scientific research without human verification
- May occasionally generate incorrect calculations
- Best performance on topics covered in training data
- Optimized for Turkish; English support is secondary
---
## 📈 Evaluation
| Metric | Score |
|--------|-------|
| CoT Quality | 9.3/10 |
| Code Accuracy | 92% |
| Turkish Fluency | 9.0/10 |
| Scientific Depth | 9.5/10 |
---
## 📄 License
This model is released under the **Apache 2.0** license.
### Citation
```bibtex
@misc{rehber_science_2025,
  title={Rehber-Science: Turkish Scientific Reasoning Model},
  author={Batuhan Ozkose},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/batuhanozkose/Rehber-Science},
  note={Fine-tuned from Qwen/Qwen3-8B for Chain-of-Thought scientific reasoning}
}
```
---
## 🙏 Acknowledgments
- [Qwen Team](https://github.com/QwenLM) for the excellent base model
- [Nebius AI](https://nebius.ai/) for training infrastructure
- [Hugging Face](https://huggingface.co/) for model hosting
- Turkish NLP community for feedback and support
---
**Created by [Batuhan Ozkose](https://huggingface.co/batuhanozkose)**
Made with ❤️ for the Turkish NLP community
🎓 Scientific Reasoning • Chain-of-Thought • Verified Code 🎓