Update README.md

README.md (CHANGED)

@@ -10,422 +10,73 @@ pipeline_tag: text-generation

datasets:
- nvidia/OpenMathReasoning
---

**Removed:**

# 🧠 DogeAI-v2.0-4B-Reasoning

## 📌 Model Details

### Model Description

DogeAI-v2.0-4B-Reasoning is a language model focused on reasoning, structured thinking, and analytical responses, created by merging a reasoning LoRA onto the Qwen3-4B-Base model.

Its main objective is to improve logical coherence, multi-step problem solving, and explanatory clarity, without drastically altering the overall behavior of the base model.

This repository contains the merged, final version, which can be used without depending on an external LoRA.

- **Developed by:** AxionLab-Co
- **Funded by:** Independent / Community-driven
- **Shared by:** AxionLab-Co
- **Model type:** Decoder-only Transformer (Causal Language Model)
- **Language(s) (NLP):** Primarily English
- **License:** Apache 2.0 (inherited from the base model)
- **Finetuned from model:** Qwen3-4B-Base

## 🔗 Model Sources

- **Repository:** Hugging Face – AxionLab-Co/DogeAI-v2.0-4B-Reasoning
- **Base Model:** Qwen/Qwen3-4B-Base
- **Training Platform:** Kaggle
- **Frameworks:** PyTorch, Transformers, PEFT
## 🎯 Uses

### Direct Use

This model can be used directly for:

- Logical and analytical reasoning
- Multi-step problem solving
- Detailed explanations ("thinking-style responses")
- AI research, experimentation, and learning

### Downstream Use

- Conversational agents focused on reasoning
- Additional fine-tuning in specific domains
- Conversion to GGUF and use in engines such as llama.cpp
- Academic or experimental research

### Out-of-Scope Use

This model is not recommended for:

- Medical, legal, or financial decisions
- Safety-critical applications
- Uses where absolute factuality is mandatory

## ⚠️ Bias, Risks, and Limitations

- May generate excessive reasoning chains, even when unnecessary
- Inherits potential biases from the base model and training data
- Has not undergone alignment- or safety-specific fine-tuning
- Generated reasoning is not guaranteed to be correct

### Recommendations

Users should:

- Critically evaluate responses
- Add extra safety layers in production
- Avoid blindly trusting chains of reasoning
## 🚀 How to Get Started with the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "AxionLab-Co/DogeAI-v2.0-4B-Reasoning",
    device_map="auto",
    torch_dtype="auto",
)
tokenizer = AutoTokenizer.from_pretrained("AxionLab-Co/DogeAI-v2.0-4B-Reasoning")

inputs = tokenizer("Solve this step by step:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## 🏋️ Training Details

### Training Data

The model was fine-tuned on datasets focused on reasoning and chain-of-thought, containing:

- Step-by-step problem solving
- Structured explanatory responses
- Synthetic and curated analytical prompts

The data were manually pre-processed to improve quality and consistency.

### Training Procedure

#### Preprocessing

- Tokenization with Qwen's original tokenizer
- Filtering of inconsistent or low-quality examples

#### Training Hyperparameters

- **Training regime:** fp16 mixed precision
- **Fine-tuning method:** LoRA (PEFT)
- **Optimizer:** AdamW
- **Framework:** Transformers + PEFT

#### Speeds, Sizes, Times

- Training performed on a Kaggle GPU
- LoRA intentionally kept lightweight
- Final merge performed via PEFT (`merge_and_unload`)
## 📊 Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

- Manual reasoning prompts
- Direct comparison with the base model

#### Factors

- Clarity of reasoning
- Logical coherence
- Tendency to hallucinate

#### Metrics

- Qualitative human evaluation
- Subjective comparison of responses

### Results

The model demonstrates better logical organization and more consistent, concise explanations in direct comparison with Qwen3-4B-Base.

### Summary

DogeAI-v2.0-4B-Reasoning prioritizes quality of thought, not just textual fluency.

## 🌱 Environmental Impact

- **Hardware Type:** NVIDIA GPU (Kaggle)
- **Hours used:** A few hours (single-session fine-tuning + merge)
- **Cloud Provider:** Kaggle
- **Compute Region:** Unknown
- **Carbon Emitted:** Not measured

## ⚙️ Technical Specifications

### Model Architecture and Objective

- Decoder-only Transformer
- Objective: improve reasoning via efficient fine-tuning

### Compute Infrastructure

#### Hardware

- NVIDIA GPU (Kaggle environment)

#### Software

- PyTorch
- Transformers
- PEFT 0.18.1

## 📚 Citation

If you use this model in research or derivative projects, please cite the base model and this repository.

## 👥 Model Card Authors

AxionLab-Co

## 📬 Model Card Contact

For questions, feedback, or collaboration: AxionLab-Co – Hugging Face
# -- FOR PORTUGUESE READERS --

*(Portuguese translation of the model card above; it mirrors the English sections verbatim.)*

**Added:**

# 🧠 DogeAI-v2.0-4B-Reasoning

**"The Small Model That Thinks Big."**

DogeAI-v2.0-4B-Reasoning is a high-efficiency model optimized for **Chain-of-Thought (CoT)** reasoning. Built by [AxionLab-Co](https://huggingface.co), it merges a specialized reasoning LoRA onto the powerful **Qwen3-4B-Base** architecture, delivering structured, step-by-step analytical capabilities in a compact 4B footprint.

### 🚀 Key Highlights

- **Architecture:** Decoder-only Transformer (Qwen3 Base).
- **Core Strength:** Multi-step logical reasoning and structured problem solving.
- **Hardware Friendly:** Optimized for local inference (low VRAM usage; see the 4-bit loading sketch below).
- **Final Merge:** No LoRA dependency; ready for production or GGUF conversion.

---
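
To make "low VRAM usage" concrete, the model can be loaded with 4-bit quantization through the generic `transformers` + `bitsandbytes` path. A minimal sketch, assuming a CUDA GPU and the optional `bitsandbytes` package are available (nothing here is specific to this model):

```python
# Hypothetical 4-bit loading sketch for low-VRAM GPUs (requires bitsandbytes).
# Quantization trades a little output quality for a much smaller memory footprint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "AxionLab-Co/DogeAI-v2.0-4B-Reasoning"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16, store weights in 4-bit
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=bnb_config,
)
```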
## 🎯 Use Cases

- **Complex Problem Solving:** Math, logic, and analytical tasks.
- **Detailed Explanations:** When you need the "why" and "how", not just the "what".
- **Local Agents:** High-performance reasoning for edge devices and local LLM setups (see the GGUF sketch below).

---
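
A minimal sketch of the local GGUF route mentioned above, using the third-party `llama-cpp-python` bindings. It assumes you have already converted the merged model to GGUF yourself (for example with llama.cpp's conversion script); `dogeai-v2-4b.gguf` is a hypothetical filename, not a file shipped in this repo:

```python
# Hypothetical local-inference sketch with llama-cpp-python.
# Assumes the merged model was converted to GGUF beforehand;
# "dogeai-v2-4b.gguf" is a placeholder filename.
from llama_cpp import Llama

llm = Llama(model_path="dogeai-v2-4b.gguf", n_ctx=4096)  # load the local GGUF model

out = llm(
    "Solve this step by step: what is 17 * 24?",
    max_tokens=256,
    temperature=0.3,  # low temperature, matching the Quick Start recommendation
)
print(out["choices"][0]["text"])  # text of the first completion choice
```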
## 🛠️ Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "AxionLab-Co/DogeAI-v2.0-4B-Reasoning"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,  # Recommended for Qwen3
)

prompt = "Solve this step-by-step: If a train leaves at 2 PM at 60mph, and another..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.3,  # Lower temperature recommended for reasoning
    do_sample=True,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
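
Because the model tends to produce long step-by-step chains, it can be nicer to watch the output as it is generated. An optional variation of the snippet above using the `TextStreamer` utility from `transformers`:

```python
# Optional: stream the reasoning chain token by token instead of waiting
# for the full generation. Reuses `model`, `tokenizer`, and `inputs` from above.
from transformers import TextStreamer

streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.3,
    do_sample=True,
    streamer=streamer,  # prints tokens to stdout as they are produced
)
```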

## 🏋️ Training & Methodology

Our goal at AxionLab was to prioritize depth of thought over mere textual fluency.

- **Dataset:** A curated mix of synthetic CoT datasets and manually pre-processed logical reasoning prompts.
- **Fine-tuning:** Performed on Kaggle GPUs using PEFT (LoRA), with a focus on preserving the base model's knowledge while injecting structured logic.
- **Optimization:** Mixed precision (fp16), with a final `merge_and_unload` for seamless deployment (sketched below).
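
The merge step named above is a small amount of PEFT code. A minimal sketch, assuming the reasoning adapter was saved locally at `./reasoning-lora` (a hypothetical path; the adapter is not published separately):

```python
# Minimal LoRA-merge sketch; "./reasoning-lora" is an assumed adapter path.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-4B-Base", torch_dtype="auto")
model = PeftModel.from_pretrained(base, "./reasoning-lora")  # attach the LoRA adapter
merged = model.merge_and_unload()  # fold adapter weights into the base weights
merged.save_pretrained("DogeAI-v2.0-4B-Reasoning")  # standalone checkpoint; PEFT no longer needed
```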

## 📊 Evaluation Results

In qualitative testing, DogeAI-v2.0-4B shows:

- **Higher logical consistency** compared with the stock Qwen3-4B-Base.
- **Reduced hallucination** in multi-step word problems.
- **Structured verbosity:** it "thinks" before it answers.

## ⚠️ Limitations & Bias

- **Reasoning loops:** The model may occasionally over-explain simple tasks.
- **Safety:** No safety-specific RLHF has been applied; use external guardrails in production.
- **Factuality:** While logic is improved, the model can still hallucinate complex facts.

## 🤝 Contact & Collaboration

Developed with ❤️ by AxionLab-Co. We are an independent, community-driven lab focused on efficient AI.

- **Organization:** AxionLab-official
- **Feedback:** Open a Discussion on this repo!
- **Language Support:** Primarily English; Portuguese is supported but may vary in reasoning depth.