AxionLab-official committed on
Commit 676b8cf · verified · 1 Parent(s): 0d2c2bf

Update README.md

Files changed (1)
  1. README.md +54 -403
README.md CHANGED
@@ -10,422 +10,73 @@ pipeline_tag: text-generation
  datasets:
  - nvidia/OpenMathReasoning
  ---
- 🧠 DogeAI-v2.0-4B-Reasoning
- 📌 Model Details
- Model Description
-
- DogeAI-v2.0-4B-Reasoning is a language model focused on reasoning, structured thinking, and analytical responses, created by merging a reasoning LoRA onto the Qwen3-4B-Base model.
-
- The main objective of this model is to improve logical coherence, multi-step problem solving, and explanatory clarity, without drastically altering the overall behavior of the base model.
-
- This model is the merged, final version and can be used without depending on an external LoRA.
-
- Developed by: AxionLab-Co
- Funded by: Independent / Community-driven
- Shared by: AxionLab-Co
- Model type: Decoder-only Transformer (Causal Language Model)
- Language(s) (NLP): Primarily English
- License: Apache 2.0 (inherits from base model)
- Finetuned from model: Qwen3-4B-Base
-
- 🔗 Model Sources
- Repository: Hugging Face – AxionLab-Co/DogeAI-v2.0-4B-Reasoning
- Base Model: Qwen/Qwen3-4B-Base
- Training Platform: Kaggle
- Frameworks: PyTorch, Transformers, PEFT
-
- 🎯 Uses
- Direct Use
- This model can be used directly for:
- Logical and analytical reasoning
- Multi-step problem solving
- Detailed explanations (“thinking-style responses”)
- AI research, experimentation, and learning
-
- Downstream Use
- Conversational agents focused on reasoning
- Additional fine-tuning in specific domains
- Conversion to GGUF and use in engines such as llama.cpp
- Academic or experimental research
-
- Out-of-Scope Use
- This model is not recommended for:
- Medical, legal, or financial decisions
- Safety-critical applications
- Use cases where absolute factuality is mandatory
-
- ⚠️ Bias, Risks, and Limitations
- May generate excessive reasoning chains, even when unnecessary
- Inherits potential biases from the base model and training data
- Has not undergone specific alignment or safety fine-tuning
- Generated reasoning is not guaranteed to be correct
-
- Recommendations
- Users should:
- Critically evaluate responses
- Use additional safety layers in production
- Avoid blindly trusting chains of reasoning
-
- 🚀 How to Get Started with the Model
- ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer
-
- model = AutoModelForCausalLM.from_pretrained(
-     "AxionLab-Co/DogeAI-v2.0-4B-Reasoning",
-     device_map="auto",
-     torch_dtype="auto",
- )
- tokenizer = AutoTokenizer.from_pretrained("AxionLab-Co/DogeAI-v2.0-4B-Reasoning")
-
- inputs = tokenizer("Solve this step by step:", return_tensors="pt").to(model.device)
- outputs = model.generate(**inputs, max_new_tokens=256)
-
- print(tokenizer.decode(outputs[0], skip_special_tokens=True))
- ```
-
- 🏋️ Training Details
- Training Data
- The model was fine-tuned on datasets focused on reasoning and chain-of-thought, containing:
- Step-by-step problem solving
- Structured explanatory responses
- Synthetic and curated analytical prompts
- The data were manually pre-processed to improve quality and consistency.
-
- Training Procedure
- Preprocessing
- Tokenization with Qwen's original tokenizer
- Filtering of inconsistent or low-quality examples
-
- Training Hyperparameters
- Training regime: fp16 mixed precision
- Fine-tuning method: LoRA (PEFT)
- Optimizer: AdamW
- Framework: Transformers + PEFT
-
- Speeds, Sizes, Times
- Training performed on a Kaggle GPU
- LoRA intentionally kept lightweight
- Final merge performed via PEFT (merge_and_unload)
-
- 📊 Evaluation
- Testing Data, Factors & Metrics
- Testing Data
- Manual reasoning prompts
- Direct comparison with the base model
-
- Factors
- Clarity of reasoning
- Logical coherence
- Tendency to hallucinate
-
- Metrics
- Qualitative human evaluation
- Subjective comparison of responses
-
- Results
- The model demonstrates better logical organization and more consistent, concise explanations in direct comparison with Qwen3-4B-Base.
-
- Summary
- DogeAI-v2.0-4B-Reasoning prioritizes quality of thought, not just textual fluency.
-
- 🌱 Environmental Impact
- Hardware Type: NVIDIA GPU (Kaggle)
- Hours used: A few hours (single-session fine-tuning + merge)
- Cloud Provider: Kaggle
- Compute Region: Unknown
- Carbon Emitted: Not measured
-
- ⚙️ Technical Specifications
- Model Architecture and Objective
- Decoder-only Transformer
- Objective: improve reasoning via efficient fine-tuning
-
- Compute Infrastructure
- Hardware
- NVIDIA GPU (Kaggle environment)
- Software
- PyTorch
- Transformers
- PEFT 0.18.1
-
- 📚 Citation
- If you use this model in research or derivative projects, please cite the base model and this repository.
-
- 👥 Model Card Authors
- AxionLab-Co
-
- 📬 Model Card Contact
- For questions, feedback, or collaboration: AxionLab-Co – Hugging Face
- # --FOR PORTUGUESE READERS --
  # 🧠 DogeAI-v2.0-4B-Reasoning
- # 📌 Model Details
- **Model Description**
-
- DogeAI-v2.0-4B-Reasoning is a language model focused on reasoning, structured thinking, and analytical responses, created by merging a reasoning LoRA onto the Qwen3-4B-Base base model.
-
- The main objective of this model is to improve logical coherence, the ability to solve multi-step problems, and explanatory clarity, without drastically altering the overall behavior of the base model.
-
- This model is the merged, final version and can be used without depending on an external LoRA.
-
- Developed by: AxionLab-Co
- Funded by: Independent / Community-driven
- Shared by: AxionLab-Co
- Model type: Decoder-only Transformer (Causal Language Model)
- Language(s) (NLP): Primarily English
- License: Apache 2.0 (inherits from base model)
- Finetuned from model: Qwen3-4B-Base
-
- # 🔗 Model Sources
- Repository: Hugging Face – AxionLab-Co/DogeAI-v2.0-4B-Reasoning
- Base Model: Qwen/Qwen3-4B-Base
- Training Platform: Kaggle
- Frameworks: PyTorch, Transformers, PEFT
-
- # 🎯 Uses
- # Direct Use
- This model can be used directly for:
- Logical and analytical reasoning
- Multi-step problem solving
- Detailed explanations (“thinking-style responses”)
- Research, experimentation, and learning in AI
-
- Downstream Use
- Conversational agents focused on reasoning
- Additional fine-tuning in specific domains
- Conversion to GGUF and use in engines such as llama.cpp
- Academic or experimental research
-
- Out-of-Scope Use
- This model is not recommended for:
- Medical, legal, or financial decisions
- Safety-critical applications
- Use cases where absolute factuality is mandatory
-
- # ⚠️ Bias, Risks, and Limitations
- May generate excessive reasoning chains, even when unnecessary
- Inherits potential biases from the base model and training data
- Has not undergone specific alignment or safety fine-tuning
- Generated reasoning is not guaranteed to be correct
-
- Recommendations
- Users should:
- Critically evaluate responses
- Use additional safety layers in production
- Avoid blindly trusting chains of reasoning
-
- # 🚀 How to Get Started with the Model
- ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer
-
  model = AutoModelForCausalLM.from_pretrained(
-     "AxionLab-Co/DogeAI-v2.0-4B-Reasoning",
-     device_map="auto",
-     torch_dtype="auto",
  )
-
- tokenizer = AutoTokenizer.from_pretrained(
-     "AxionLab-Co/DogeAI-v2.0-4B-Reasoning"
  )
-
- inputs = tokenizer("Solve this step by step:", return_tensors="pt").to(model.device)
- outputs = model.generate(**inputs, max_new_tokens=256)
-
- print(tokenizer.decode(outputs[0], skip_special_tokens=True))
- ```
- # 🏋️ Training Details
- Training Data
- The model was fine-tuned on datasets focused on reasoning and chain-of-thought, containing:
- Step-by-step problem solving
- Structured explanatory responses
- Synthetic and curated analytical prompts
- The data were manually pre-processed to improve quality and consistency.
-
- Training Procedure
- Preprocessing
- Tokenization with Qwen's original tokenizer
- Filtering of inconsistent or low-quality examples
-
- Training Hyperparameters
- Training regime: fp16 mixed precision
- Fine-tuning method: LoRA (PEFT)
- Optimizer: AdamW
- Framework: Transformers + PEFT
-
- Speeds, Sizes, Times
- Training performed on a Kaggle GPU
- LoRA intentionally kept lightweight
- Final merge performed via PEFT (merge_and_unload)
-
- # 📊 Evaluation
- Testing Data, Factors & Metrics
- Testing Data
- Manual reasoning prompts
- Direct comparison with the base model
-
- Factors
- Clarity of reasoning
- Logical coherence
- Tendency to hallucinate
-
- Metrics
- Qualitative human evaluation
- Subjective comparison of responses
-
- Results
- The model demonstrates better logical organization and more consistent explanations in direct comparison with Qwen3-4B-Base.
-
- Summary
- DogeAI-v2.0-4B-Reasoning prioritizes quality of thought, not just textual fluency.
-
- # 🌱 Environmental Impact
- Hardware Type: NVIDIA GPU (Kaggle)
- Hours used: A few hours (single-session fine-tuning + merge)
- Cloud Provider: Kaggle
- Compute Region: Unknown
- Carbon Emitted: Not measured
-
- # ⚙️ Technical Specifications
- # Model Architecture and Objective
- Decoder-only Transformer
- Objective: improve reasoning via efficient fine-tuning
-
- Compute Infrastructure
- Hardware
- NVIDIA GPU (Kaggle environment)
- Software
- PyTorch
- Transformers
- PEFT 0.18.1
-
- # 📚 Citation
- If you use this model in research or derivative projects, please cite the base model and this repository.
-
- # 👥 Model Card Authors
- AxionLab-Co
-
- # 📬 Model Card Contact
- For questions, feedback, or collaboration:
- AxionLab-Co – Hugging Face
 
  # 🧠 DogeAI-v2.0-4B-Reasoning
+ **"The Small Model That Thinks Big."**
+
+ DogeAI-v2.0-4B-Reasoning is a high-efficiency model optimized for **Chain-of-Thought (CoT)**. Built by [AxionLab-Co](https://huggingface.co), it merges a specialized reasoning LoRA onto the powerful **Qwen3-4B-Base** architecture, delivering structured, step-by-step analytical capabilities in a compact 4B footprint.
+
+ ### 🚀 Key Highlights
+ - **Architecture:** Decoder-only Transformer (Qwen3 Base).
+ - **Core Strength:** Multi-step logical reasoning and structured problem solving.
+ - **Hardware Friendly:** Optimized for local inference (low VRAM usage).
+ - **Final Merge:** No LoRA dependency; ready for production or GGUF conversion.
+
+ ---
+ ## 🎯 Use Cases
+ - **Complex Problem Solving:** Math, logic, and analytical tasks.
+ - **Detailed Explanations:** When you need the "why" and "how", not just the "what".
+ - **Local Agents:** High-performance reasoning for edge devices and local LLM setups.
+
+ ---
+ ## 🛠️ Quick Start
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ model_id = "AxionLab-Co/DogeAI-v2.0-4B-Reasoning"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     device_map="auto",
+     torch_dtype=torch.bfloat16,  # Recommended for Qwen3
  )
+
+ prompt = "Solve this step-by-step: If a train leaves at 2 PM at 60mph, and another..."
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+
+ outputs = model.generate(
+     **inputs,
+     max_new_tokens=512,
+     temperature=0.3,  # Lower temp recommended for reasoning
+     do_sample=True,
  )
+
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
+
+ ## 🏋️ Training & Methodology
+ Our goal at AxionLab was to prioritize Depth of Thought over mere textual fluency.
+ - **Dataset:** A curated mix of synthetic CoT datasets and manually pre-processed logical reasoning prompts.
+ - **Fine-tuning:** Performed on Kaggle GPUs using PEFT (LoRA) with a focus on preserving the base model's knowledge while injecting structured logic.
+ - **Optimization:** Mixed precision (fp16) with a final merge_and_unload for seamless deployment (see the sketch after this list).
+
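The card does not publish the exact LoRA configuration, so the following is only a minimal sketch of the workflow described above (LoRA fine-tuning on Qwen3-4B-Base with PEFT, then folding the adapter in with `merge_and_unload`). The rank, alpha, target modules, and output paths are illustrative assumptions, not the values actually used.

```python
# Minimal sketch of the described workflow: LoRA fine-tuning + final merge.
# r, lora_alpha, target_modules, and the paths below are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, PeftModel, get_peft_model

base_id = "Qwen/Qwen3-4B-Base"

# 1) Attach a lightweight LoRA adapter to the base model and fine-tune it (fp16).
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
lora_cfg = LoraConfig(
    r=16,            # assumed rank
    lora_alpha=32,   # assumed scaling
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
# ... run the fine-tuning loop here (e.g. with transformers.Trainer) ...
model.save_pretrained("doge-reasoning-lora")  # saves the adapter weights only

# 2) Merge the adapter into fresh base weights so inference needs no PEFT dependency.
merged = PeftModel.from_pretrained(
    AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16),
    "doge-reasoning-lora",
).merge_and_unload()
merged.save_pretrained("DogeAI-v2.0-4B-Reasoning")
```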
+ ## 📊 Evaluation Results
+ In qualitative testing, DogeAI-v2.0-4B shows:
+ - **Higher Logical Consistency** compared to the stock Qwen3-4B-Base (a side-by-side check is sketched after this list).
+ - **Reduced Hallucination** in multi-step word problems.
+ - **Structured Verbosity:** It "thinks" before it answers.
+
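The evaluation above is qualitative rather than benchmark-based. A minimal sketch of such a side-by-side comparison is shown below; the prompt is an arbitrary example and judging the outputs remains manual.

```python
# Qualitative side-by-side check against the base model (manual inspection of outputs).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

prompt = (
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. "
    "How much does the ball cost? Think step by step."
)

for model_id in ["Qwen/Qwen3-4B-Base", "AxionLab-Co/DogeAI-v2.0-4B-Reasoning"]:
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, device_map="auto", torch_dtype=torch.bfloat16
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
    print(f"--- {model_id} ---")
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```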
+ ## ⚠️ Limitations & Bias
+ - **Reasoning Loops:** The model might occasionally over-explain simple tasks.
+ - **Safety:** No specific safety RLHF has been applied. Use with external safety guardrails in production.
+ - **Factuality:** While logic is improved, it can still hallucinate complex facts.
+
+ ## 🤝 Contact & Collaboration
+ Developed with ❤️ by AxionLab-Co. We are an independent, community-driven lab focused on efficient AI.
+ - **Organization:** AxionLab-official
+ - **Feedback:** Open a Discussion on this repo!
+ - **Language Support:** Primarily English. Portuguese support is available but may vary in reasoning depth.