Model Card for DogeAI-v2.0-4B-Reasoning-LoRA
This repository contains a LoRA (Low-Rank Adaptation) adapter fine-tuned on top of Qwen3-4B-Base, focused on improving reasoning, chain-of-thought coherence, and analytical responses. The LoRA was trained on Kaggle using curated thinking-style datasets, with the goal of enhancing logical consistency rather than factual memorization.
Model Details
Model Description
This is a reasoning-oriented LoRA adapter designed to be applied to Qwen3-4B-Base. The training emphasizes structured thinking, multi-step reasoning, and clearer internal deliberation in responses.
Developed by: AxionLab-Co
Model type: LoRA adapter (PEFT)
Language(s) (NLP): Primarily English
License: Apache 2.0 (inherits base model license)
Finetuned from model: Qwen3-4B-Base
Model Sources
Base Model: Qwen3-4B-Base
Training Platform: Kaggle
Frameworks: PyTorch, PEFT, Unsloth
Uses
Direct Use
This LoRA is intended to be merged or loaded on top of Qwen3-4B-Base to improve:
Logical reasoning
Step-by-step problem solving
Analytical and structured responses
“Thinking-style” outputs for research and experimentation
Downstream Use
Merging into a full model for GGUF or standard HF release
Further fine-tuning on domain-specific reasoning tasks
Research on symbolic + neural reasoning hybrids
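Merging the adapter into the base model (for example, before converting to GGUF or publishing a standard HF checkpoint) can be sketched as follows. The repository IDs match this card; the output directory name is a placeholder:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model in full precision so merged weights can be saved cleanly
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-4B-Base")
model = PeftModel.from_pretrained(base_model, "AxionLab-Co/DogeAI-v2.0-4B-Reasoning-LoRA")

# Fold the LoRA weights into the base weights and drop the adapter wrappers
merged = model.merge_and_unload()

# Save a standard Hugging Face checkpoint (ready for GGUF conversion tools)
merged.save_pretrained("dogeai-v2-4b-merged")
AutoTokenizer.from_pretrained("Qwen/Qwen3-4B-Base").save_pretrained("dogeai-v2-4b-merged")
```

Merging removes the PEFT dependency at inference time, at the cost of storing a full-size checkpoint.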
Out-of-Scope Use
Safety-critical decision making
Medical, legal, or financial advice
Tasks requiring guaranteed factual correctness
Bias, Risks, and Limitations
The model may overproduce reasoning steps, even when not strictly required
Reasoning quality depends heavily on the base model (Qwen3-4B-Base)
No formal safety fine-tuning was applied beyond the base model
Possible amplification of biases present in the original training data
Recommendations
Users should:
Apply external safety layers if deploying in production
Evaluate outputs critically, especially for sensitive topics
Avoid assuming reasoning chains are always correct
How to Get Started with the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model in 4-bit (requires the bitsandbytes package)
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Base",
    device_map="auto",
    load_in_4bit=True,
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-4B-Base")

# Attach the reasoning LoRA adapter
model = PeftModel.from_pretrained(
    base_model,
    "AxionLab-Co/DogeAI-v2.0-4B-Reasoning-LoRA",
)
```
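Once the adapter is attached, generation works like any causal LM. The prompt below is only illustrative:

```python
# Assumes `model` and `tokenizer` from the loading code above
prompt = "Explain step by step why the sum of two odd numbers is even."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the echoed prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```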
Training Details
Training Data
The LoRA was trained on thinking-oriented datasets, focusing on:
Chain-of-thought style reasoning
Logical explanations
Multi-step analytical prompts
The datasets were curated and preprocessed manually for quality and consistency.
Training Procedure
Preprocessing
Tokenization using the base Qwen tokenizer
Filtering of low-quality or malformed reasoning examples
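The exact filtering rules are not published; a minimal sketch of the kind of heuristics involved (the field names, length thresholds, and step counts here are assumptions, not the actual pipeline) might look like:

```python
def is_well_formed(example: dict, min_steps: int = 2, min_chars: int = 40) -> bool:
    """Heuristic filter for reasoning examples (illustrative, not the actual pipeline)."""
    prompt = example.get("prompt", "").strip()
    response = example.get("response", "").strip()
    if not prompt or len(response) < min_chars:
        return False
    # Require at least `min_steps` visible reasoning steps (non-empty lines)
    steps = [line for line in response.splitlines() if line.strip()]
    return len(steps) >= min_steps

examples = [
    {"prompt": "Why is 7 prime?",
     "response": "Step 1: check divisors 2..6.\nStep 2: none divide 7, so 7 is prime."},
    {"prompt": "", "response": "malformed"},
]
kept = [ex for ex in examples if is_well_formed(ex)]  # keeps only the first example
```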
Training Hyperparameters
Training regime: fp16 mixed precision
Fine-tuning method: LoRA (PEFT)
Optimizer: AdamW
Framework: Unsloth
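The exact LoRA hyperparameters are not listed; an illustrative PEFT configuration consistent with the setup described above (rank, alpha, dropout, and target modules are all assumptions) could be:

```python
from peft import LoraConfig

# Illustrative values only; the actual training hyperparameters were not published
lora_config = LoraConfig(
    r=16,                    # adapter rank (assumed)
    lora_alpha=32,           # scaling factor (assumed)
    lora_dropout=0.05,       # assumed
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # typical attention projections
)
```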
Speeds, Sizes, Times
Training performed on Kaggle GPU environment
LoRA size kept intentionally lightweight for fast loading and merging
Evaluation
Testing Data, Factors & Metrics

Testing Data
Internal prompt-based reasoning tests
Synthetic reasoning benchmarks (qualitative)
Factors
Multi-step logic consistency
Response clarity
Hallucination tendencies
Metrics
Qualitative human evaluation
Prompt-level comparison against base model
Results
In qualitative comparisons against the base model, the LoRA shows improvements in reasoning depth and structure, especially on analytical prompts.
Environmental Impact
Hardware Type: NVIDIA GPU (Kaggle)
Hours used: A few hours (single-session fine-tuning)
Cloud Provider: Kaggle
Compute Region: Unknown
Carbon Emitted: Not formally measured
Technical Specifications
Model Architecture and Objective
Transformer-based decoder-only architecture
Objective: enhance reasoning behavior via parameter-efficient fine-tuning
Compute Infrastructure

Hardware
Kaggle-provided NVIDIA GPU
Software
PyTorch
Transformers
PEFT 0.18.1
Unsloth
Citation
If you use this LoRA in research or derivative works, please cite the base model and this repository.
Model Card Authors
AxionLab-Co
Model Card Contact
For questions, experiments, or collaboration: AxionLab-Co on Hugging Face