---
base_model: unsloth/Qwen3-4B-Base
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:unsloth/Qwen3-4B-Base
- grpo
- lora
- sft
- transformers
- trl
- unsloth
license: apache-2.0
datasets:
- open-r1/OpenR1-Math-220k
language:
- pt
- en
---
# Model Card for DogeAI-v2.0-4B-Reasoning-LoRA
This repository contains a LoRA (Low-Rank Adaptation) adapter fine-tuned on top of Qwen3-4B-Base, focused on improving reasoning, chain-of-thought coherence, and analytical responses.

The adapter was trained on Kaggle using curated thinking-style datasets, with the goal of enhancing logical consistency rather than factual memorization.
# Model Details
## Model Description
This is a reasoning-oriented LoRA adapter designed to be applied to Qwen3-4B-Base.
The training emphasizes structured thinking, multi-step reasoning, and clearer internal deliberation in responses.
- **Developed by:** AxionLab-Co
- **Model type:** LoRA adapter (PEFT)
- **Language(s) (NLP):** Primarily English
- **License:** Apache 2.0 (inherits base model license)
- **Finetuned from model:** Qwen3-4B-Base
## Model Sources
- **Base Model:** Qwen3-4B-Base
- **Training Platform:** Kaggle
- **Frameworks:** PyTorch, PEFT, Unsloth
# Uses
## Direct Use
This LoRA is intended to be merged with, or loaded on top of, Qwen3-4B-Base to improve:
- Logical reasoning
- Step-by-step problem solving
- Analytical and structured responses
- “Thinking-style” outputs for research and experimentation
## Downstream Use
- Merging into a full model for a GGUF or standard HF release (see the merge sketch below)
- Further fine-tuning on domain-specific reasoning tasks
- Research on symbolic + neural reasoning hybrids
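
For the merge-and-release path, the standard PEFT workflow applies. A minimal sketch, assuming a full-precision reload of the base model (the output directory name is illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Merging requires unquantized weights, so load the base model in full/bf16 precision
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-4B-Base", torch_dtype="auto")
model = PeftModel.from_pretrained(base, "AxionLab-Co/DogeAI-v2.0-4B-Reasoning-LoRA")

# Fold the LoRA deltas into the base weights and drop the adapter wrappers
merged = model.merge_and_unload()

# Save a standalone HF checkpoint; GGUF conversion (e.g. via llama.cpp) starts from here
merged.save_pretrained("dogeai-v2.0-4b-merged")
AutoTokenizer.from_pretrained("Qwen/Qwen3-4B-Base").save_pretrained("dogeai-v2.0-4b-merged")
```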
## Out-of-Scope Use
- Safety-critical decision making
- Medical, legal, or financial advice
- Tasks requiring guaranteed factual correctness
# Bias, Risks, and Limitations
- The model may overproduce reasoning steps, even when not strictly required
- Reasoning quality depends heavily on the base model (Qwen3-4B-Base)
- No formal safety fine-tuning was applied beyond the base model
- Possible amplification of biases present in the original training data
## Recommendations
Users should:
- Apply external safety layers if deploying in production
- Evaluate outputs critically, especially for sensitive topics
- Avoid assuming reasoning chains are always correct
# How to Get Started with the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model (4-bit quantization requires the bitsandbytes package)
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Base",
    device_map="auto",
    load_in_4bit=True,
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-4B-Base")

# Attach the reasoning LoRA adapter
model = PeftModel.from_pretrained(
    base_model,
    "AxionLab-Co/DogeAI-v2.0-4B-Reasoning-LoRA",
)
```
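
Once the adapter is attached, generation works like any causal LM. A quick check (the prompt is only illustrative):

```python
prompt = "Solve step by step: if 3x + 5 = 20, what is x?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```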
# Training Details
## Training Data
The LoRA was trained on thinking-oriented datasets, focusing on:
- Chain-of-thought style reasoning
- Logical explanations
- Multi-step analytical prompts

The datasets were curated and preprocessed manually for quality and consistency.
## Training Procedure
### Preprocessing
- Tokenization using the base Qwen tokenizer
- Filtering of low-quality or malformed reasoning examples
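
The exact filtering rules were not published. As an illustration only, a heuristic of the following shape can drop empty or truncated chain-of-thought examples; the dataset config/split and the `solution` field name are assumptions:

```python
from datasets import load_dataset

ds = load_dataset("open-r1/OpenR1-Math-220k", split="train")  # config/split assumed

def keep_example(ex):
    """Heuristic quality filter (illustrative, not the actual training filter)."""
    solution = ex.get("solution") or ""
    # Drop empty or clearly truncated reasoning traces
    if len(solution) < 50:
        return False
    # Require at least two discernible reasoning steps
    return solution.count("\n") >= 2 or solution.count(". ") >= 2

ds = ds.filter(keep_example)
```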
### Training Hyperparameters
- **Training regime:** fp16 mixed precision
- **Fine-tuning method:** LoRA (PEFT)
- **Optimizer:** AdamW
- **Framework:** Unsloth
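
The full recipe was not released. A configuration along these lines matches the stated setup (Unsloth, LoRA, AdamW, fp16); the rank, learning rate, batch size, and epoch count below are assumptions, not the actual training values, and the dataset is assumed to be pre-formatted into a single `text` column:

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer, SFTConfig
from datasets import load_dataset

# 4-bit base load is assumed (QLoRA-style); fp16 mixed precision per the card
model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/Qwen3-4B-Base",
    max_seq_length=4096,     # assumed
    load_in_4bit=True,       # assumed
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                    # LoRA rank (assumed)
    lora_alpha=16,           # assumed
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

ds = load_dataset("open-r1/OpenR1-Math-220k", split="train")  # formatting step omitted

trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,
    train_dataset=ds,
    args=SFTConfig(
        dataset_text_field="text",          # assumes a formatted `text` column
        per_device_train_batch_size=2,      # assumed
        gradient_accumulation_steps=8,      # assumed
        learning_rate=2e-4,                 # assumed
        num_train_epochs=1,                 # assumed
        fp16=True,                          # matches the stated fp16 regime
        optim="adamw_torch",                # AdamW, per the card
        output_dir="outputs",
    ),
)
trainer.train()
```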
### Speeds, Sizes, Times
- Training performed in a Kaggle GPU environment
- The LoRA was kept intentionally lightweight for fast loading and merging
# Evaluation
## Testing Data, Factors & Metrics
### Testing Data
- Internal prompt-based reasoning tests
- Synthetic reasoning benchmarks (qualitative)
### Factors
- Multi-step logic consistency
- Response clarity
- Hallucination tendencies
### Metrics
- Qualitative human evaluation
- Prompt-level comparison against the base model (see the sketch below)
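
The comparison itself is straightforward to reproduce: generate from the base model and from the adapter-augmented model on the same prompt and inspect the outputs side by side (the prompt is illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-4B-Base")
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Base", torch_dtype="auto", device_map="auto"
)

def generate(model, prompt):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=256, do_sample=False)
    # Return only the continuation, not the echoed prompt
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

prompt = "A tank holds 240 liters and drains at 8 liters per minute. How long until it is half empty?"
print("base :", generate(base, prompt))

# Attaching the adapter wraps the same weights, so generate from the base first
tuned = PeftModel.from_pretrained(base, "AxionLab-Co/DogeAI-v2.0-4B-Reasoning-LoRA")
print("lora :", generate(tuned, prompt))
```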
## Results
The LoRA shows clear improvements in reasoning depth and structure compared to the base model, especially on analytical prompts.
# Environmental Impact
- **Hardware Type:** NVIDIA GPU (Kaggle)
- **Hours used:** A few hours (single-session fine-tuning)
- **Cloud Provider:** Kaggle
- **Compute Region:** Unknown
- **Carbon Emitted:** Not formally measured
# Technical Specifications
## Model Architecture and Objective
- Transformer-based decoder-only architecture
- Objective: enhance reasoning behavior via parameter-efficient fine-tuning
## Compute Infrastructure
### Hardware
- Kaggle-provided NVIDIA GPU

### Software
- PyTorch
- Transformers
- PEFT 0.18.1
- Unsloth
# Citation
If you use this LoRA in research or derivative works, please cite the base model and this repository.
# Model Card Authors
**AxionLab-Co**
# Model Card Contact
For questions, experiments, or collaboration:
**AxionLab-Co on Hugging Face**