Update README.md
README.md
CHANGED

@@ -7,6 +7,8 @@ language:
base_model:
- unsloth/Qwen3-4B-Base
pipeline_tag: text-generation
datasets:
- nvidia/OpenMathReasoning
---

# 🧠 DogeAI-v2.0-4B-Reasoning

@@ -228,4 +230,227 @@ AxionLab-Co

# 📬 Model Card Contact

For questions, feedback, or collaboration:
AxionLab-Co – Hugging Face

## -- *FOR ENGLISH READERS* --

# 🧠 DogeAI-v2.0-4B-Reasoning

# 📌 Model Details

**Model Description**

DogeAI-v2.0-4B-Reasoning is a language model focused on reasoning, structured thinking, and analytical responses, created by merging a reasoning LoRA into the Qwen3-4B-Base model.

Its main objective is to improve logical coherence, multi-step problem solving, and explanatory clarity without drastically altering the overall behavior of the base model.

This is the merged, final version; it can be used directly, with no dependence on an external LoRA adapter.

- **Developed by:** AxionLab-Co
- **Funded by:** Independent / community-driven
- **Shared by:** AxionLab-Co
- **Model type:** Decoder-only Transformer (causal language model)
- **Language(s) (NLP):** Primarily English
- **License:** Apache 2.0 (inherited from the base model)
- **Finetuned from model:** Qwen3-4B-Base

# 🔗 Model Sources

- **Repository:** Hugging Face – AxionLab-Co/DogeAI-v2.0-4B-Reasoning
- **Base model:** Qwen/Qwen3-4B-Base
- **Training platform:** Kaggle
- **Frameworks:** PyTorch, Transformers, PEFT

# 🎯 Uses

## Direct Use

This model can be used directly for:

- Logical and analytical reasoning
- Multi-step problem solving
- Detailed explanations ("thinking-style" responses)
- AI research, experimentation, and learning

## Downstream Use

- Conversational agents focused on reasoning
- Additional fine-tuning on specific domains
- Conversion to GGUF and use in engines such as llama.cpp
- Academic or experimental research

## Out-of-Scope Use

This model is not recommended for:

- Medical, legal, or financial decisions
- Safety-critical applications
- Use cases where absolute factuality is mandatory

# ⚠️ Bias, Risks, and Limitations

- May generate excessively long reasoning chains, even when unnecessary
- Inherits potential biases from the base model and training data
- Has not undergone specific alignment or safety fine-tuning
- Generated reasoning is not guaranteed to be correct

## Recommendations

Users should:

- Critically evaluate responses
- Apply additional safety layers in production
- Avoid blindly trusting the model's chains of reasoning

# 🚀 How to Get Started with the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "AxionLab-Co/DogeAI-v2.0-4B-Reasoning",
    device_map="auto",
    torch_dtype="auto",
)

tokenizer = AutoTokenizer.from_pretrained(
    "AxionLab-Co/DogeAI-v2.0-4B-Reasoning"
)

inputs = tokenizer("Solve this step by step:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

# 🏋️ Training Details

## Training Data

The model was fine-tuned on datasets focused on reasoning and chain-of-thought, containing:

- Step-by-step problem solving
- Structured explanatory responses
- Synthetic and curated analytical prompts

The data was manually pre-processed to improve quality and consistency.

## Training Procedure

### Preprocessing

- Tokenization with Qwen's original tokenizer
- Filtering of inconsistent or low-quality examples (a rough sketch of this kind of pass is shown below)
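
The exact preprocessing script is not published with this card; the following is only a minimal sketch of what such a tokenize-and-filter pass could look like, assuming hypothetical `prompt`/`response` field names and illustrative length thresholds:

```python
from transformers import AutoTokenizer

# The card states that Qwen's original tokenizer was used
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-4B-Base")

def keep_example(example, min_chars=32, max_tokens=2048):
    """Drop empty, very short, or overly long examples (thresholds are illustrative)."""
    # "prompt" / "response" are hypothetical field names, not the actual dataset schema
    text = (example.get("prompt", "") + "\n" + example.get("response", "")).strip()
    if len(text) < min_chars:
        return False
    return len(tokenizer(text)["input_ids"]) <= max_tokens

# Example usage with a 🤗 datasets object:
# filtered = raw_dataset.filter(keep_example)
```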

### Training Hyperparameters

- **Training regime:** fp16 mixed precision
- **Fine-tuning method:** LoRA (PEFT); an illustrative setup is sketched below
- **Optimizer:** AdamW
- **Framework:** Transformers + PEFT
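
The card does not list concrete LoRA hyperparameters (rank, alpha, dropout, target modules); the values below are placeholders only, shown as a minimal PEFT setup consistent with the fp16 LoRA recipe described above:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load the base model in fp16, matching the stated training regime
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Base",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Placeholder values: the actual rank/alpha/dropout used for DogeAI are not documented
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
```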

### Speeds, Sizes, Times

- Training performed on a Kaggle GPU
- LoRA intentionally kept lightweight
- Final merge performed via PEFT (`merge_and_unload`), as sketched below
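
`merge_and_unload` folds the trained LoRA weights back into the base model, so the result can be loaded without PEFT. A minimal sketch of that merge step, with a hypothetical adapter path:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-4B-Base", torch_dtype="auto")

# "path/to/reasoning-lora" is a placeholder for the trained adapter directory
model = PeftModel.from_pretrained(base, "path/to/reasoning-lora")

merged = model.merge_and_unload()  # bake the LoRA deltas into the base weights

merged.save_pretrained("DogeAI-v2.0-4B-Reasoning")
AutoTokenizer.from_pretrained("Qwen/Qwen3-4B-Base").save_pretrained("DogeAI-v2.0-4B-Reasoning")
```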

# 📊 Evaluation

## Testing Data, Factors & Metrics

### Testing Data

- Manual reasoning prompts
- Direct comparison with the base model (illustrated below)

### Factors

- Clarity of reasoning
- Logical coherence
- Tendency to hallucinate

### Metrics

- Qualitative human evaluation
- Subjective comparison of responses
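
Since evaluation was a qualitative side-by-side comparison rather than a benchmark run, a comparison of this kind can be sketched as follows (the prompt is only an example, not one of the actual test prompts):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

prompt = "Solve step by step: a train travels 180 km in 2.5 hours. What is its average speed?"

def answer(model_id: str) -> str:
    """Generate a response from the given model for the shared prompt."""
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=256)
    return tok.decode(out[0], skip_special_tokens=True)

# Compare the base model and the merged reasoning model on the same prompt
print("BASE:\n", answer("Qwen/Qwen3-4B-Base"))
print("DOGEAI:\n", answer("AxionLab-Co/DogeAI-v2.0-4B-Reasoning"))
```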

### Results

In direct comparison with Qwen3-4B-Base, the model consistently demonstrates better logical organization and more concise explanations.

### Summary

DogeAI-v2.0-4B-Reasoning prioritizes quality of thought, not just textual fluency.

# 🌱 Environmental Impact

- **Hardware type:** NVIDIA GPU (Kaggle)
- **Hours used:** A few hours (single-session fine-tuning + merge)
- **Cloud provider:** Kaggle
- **Compute region:** Unknown
- **Carbon emitted:** Not measured

# ⚙️ Technical Specifications

## Model Architecture and Objective

- Decoder-only Transformer
- Objective: improve reasoning via efficient fine-tuning

## Compute Infrastructure

### Hardware

- NVIDIA GPU (Kaggle environment)

### Software

- PyTorch
- Transformers
- PEFT 0.18.1

# 📚 Citation

If you use this model in research or derivative projects, please cite the base model and this repository.

# 👥 Model Card Authors

AxionLab-Co

# 📬 Model Card Contact

For questions, feedback, or collaboration: AxionLab-Co – Hugging Face