datasets:
- nvidia/OpenMathReasoning
---

# 🧠 DogeAI-v2.0-4B-Reasoning

# 📌 Model Details

**Model Description**

DogeAI-v2.0-4B-Reasoning is a language model focused on reasoning, structured thinking, and analytical responses, created by merging a reasoning LoRA onto the Qwen3-4B-Base model.

The main objective of this model is to improve logical coherence, multi-step problem solving, and explanatory clarity, without drastically altering the overall behavior of the base model.

This model is the merged, final version and can be used without depending on an external LoRA adapter.

Developed by: AxionLab-Co

License: Apache 2.0 (inherits from base model)

Finetuned from model: Qwen3-4B-Base

# 🔗 Model Sources

Repository: Hugging Face – AxionLab-Co/DogeAI-v2.0-4B-Reasoning

Base Model: Qwen/Qwen3-4B-Base

Training Platform: Kaggle

Frameworks: PyTorch, Transformers, PEFT

# 🎯 Uses

## Direct Use

This model can be used directly for:

- Logical and analytical reasoning
- Multi-step problem solving
- Detailed explanations (“thinking-style responses”)
- AI research, experimentation, and learning

## Downstream Use

- Conversational agents focused on reasoning
- Additional fine-tuning in specific domains
- Conversion to GGUF and use in engines such as llama.cpp (see the sketch below)
- Academic or experimental research
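
As a sketch of the GGUF route above: after converting the merged checkpoint with llama.cpp's `convert_hf_to_gguf.py` script, it can be loaded through the `llama-cpp-python` bindings. The output file name below is hypothetical, and the conversion command assumes a local llama.cpp checkout.

```python
# Minimal sketch, assuming the checkpoint was already converted, e.g. with:
#   python convert_hf_to_gguf.py ./DogeAI-v2.0-4B-Reasoning --outfile dogeai-v2.0-4b.gguf
# (script name from llama.cpp; the output file name is hypothetical)
from llama_cpp import Llama

llm = Llama(
    model_path="dogeai-v2.0-4b.gguf",  # hypothetical GGUF file from the step above
    n_ctx=4096,                        # context window; adjust to your hardware
)

result = llm("Solve this step by step: 17 * 24 = ?", max_tokens=256)
print(result["choices"][0]["text"])
```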

## Out-of-Scope Use

This model is not recommended for:

- Medical, legal, or financial decisions
- Safety-critical applications
- Use cases where absolute factuality is mandatory

# ⚠️ Bias, Risks, and Limitations

- May generate excessively long reasoning chains, even when they are unnecessary
- May inherit biases from the base model and the training data
- Has not undergone alignment- or safety-specific fine-tuning
- Generated reasoning chains are not guaranteed to be correct

## Recommendations

Users should:

- Critically evaluate responses
- Add additional safety layers in production
- Avoid blindly trusting chains of reasoning

# 🚀 How to Get Started with the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged model; "auto" lets Transformers choose device placement
# and precision for the available hardware.
model = AutoModelForCausalLM.from_pretrained(
    "AxionLab-Co/DogeAI-v2.0-4B-Reasoning",
    device_map="auto",
    torch_dtype="auto",
)

tokenizer = AutoTokenizer.from_pretrained(
    "AxionLab-Co/DogeAI-v2.0-4B-Reasoning"
)

# Plain text completion: the model continues the prompt.
inputs = tokenizer("Solve this step by step:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

# 🏋️ Training Details

## Training Data

The model was fine-tuned on datasets focused on reasoning and chain-of-thought, containing:

- Step-by-step problem solving
- Structured explanatory responses
- Synthetic and curated analytical prompts

The data were manually preprocessed to improve quality and consistency.

## Training Procedure

**Preprocessing**

- Tokenization with Qwen's original tokenizer
- Filtering of inconsistent or low-quality examples, as illustrated below
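
A minimal sketch of that kind of quality filter, using the `datasets` library. The `cot` split and `generated_solution` field follow nvidia/OpenMathReasoning's published schema, but the thresholds and criteria are assumptions for illustration, not this model's actual (undocumented) pipeline:

```python
from datasets import load_dataset

# Hypothetical filter: split/field names follow the nvidia/OpenMathReasoning
# dataset card; the length thresholds are illustrative, not the authors' own.
ds = load_dataset("nvidia/OpenMathReasoning", split="cot")

def looks_usable(example):
    solution = example["generated_solution"] or ""
    # Drop empty, truncated, or implausibly short/long reasoning traces.
    return 50 < len(solution) < 20_000

filtered = ds.filter(looks_usable)
print(f"kept {len(filtered)} of {len(ds)} examples")
```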

## Training Hyperparameters

Framework: Transformers + PEFT

## Speeds, Sizes, Times

- Training performed on a Kaggle GPU
- LoRA intentionally kept lightweight
- Final merge performed via PEFT (`merge_and_unload`), as sketched below
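
A minimal sketch of that merge step, assuming a local adapter directory (the adapter path is hypothetical; `merge_and_unload` is PEFT's call for folding LoRA weights into the base model):

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Attach the LoRA adapter to the base model, then fold its weights in so the
# saved checkpoint needs no PEFT dependency at inference time.
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-4B-Base", torch_dtype="auto")
model = PeftModel.from_pretrained(base, "./reasoning-lora")  # hypothetical adapter dir
merged = model.merge_and_unload()

merged.save_pretrained("./DogeAI-v2.0-4B-Reasoning")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-4B-Base")
tokenizer.save_pretrained("./DogeAI-v2.0-4B-Reasoning")
```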

# 📊 Evaluation

## Testing Data, Factors & Metrics

**Testing Data**

- Manual reasoning prompts
- Direct comparison with the base model

**Factors**

- Clarity of reasoning
- Logical coherence
- Tendency to hallucinate

**Metrics**

- Qualitative human evaluation
- Subjective comparison of responses

**Results**

The model demonstrates better logical organization and more consistent explanations in direct comparison with Qwen3-4B-Base.

**Summary**

DogeAI-v2.0-4B-Reasoning prioritizes quality of thought, not just textual fluency.

# 🌱 Environmental Impact

Hardware Type: NVIDIA GPU (Kaggle)

Hours used: Few hours (single-session fine-tuning + merge)

Compute Region: Unknown

Carbon Emitted: Not measured

# ⚙️ Technical Specifications

## Model Architecture and Objective

- Architecture: Decoder-only Transformer
- Objective: improve reasoning via efficient fine-tuning

## Compute Infrastructure

**Hardware**

NVIDIA GPU (Kaggle environment)

**Software**

- Transformers
- PEFT 0.18.1

# 📚 Citation

If you use this model in research or derivative projects, please cite the base model and this repository.

# 👥 Model Card Authors

AxionLab-Co

# 📬 Model Card Contact

For questions, feedback, or collaboration: AxionLab-Co – Hugging Face