# Llama3-3B Uruchi Instruct
Instruction-tuned version of Llama-3-3B optimized for concise, direct answers. The model was fine-tuned with LoRA and Unsloth on a mixture of instruction datasets.
## Model Overview
| Attribute | Value |
|---|---|
| Model Name | llama3-3b-uruchi-instruct |
| Base Model | meta-llama/Llama-3-3B |
| Parameters | 3B |
| Fine-tuning Method | LoRA |
| Framework | Unsloth |
| Model Format | safetensors |
| Author | Irfan Uruchi |
## Training
The model was fine-tuned using LoRA adapters with the Unsloth training framework for efficient GPU usage.
Training used a cleaned and merged instruction dataset containing approximately 107k samples.
The training configuration uses LoRA fine-tuning with instruction-style prompts and is optimized for concise responses via dataset continuation filtering.
### Training Datasets
The training dataset consists of a mixture of open instruction datasets:

- **UltraChat**: high-quality conversational instruction dataset.
- **OpenOrca**: reasoning and explanation dataset derived from GPT-style instruction generation.
- **GSM8K**: math reasoning dataset used to improve logical reasoning capabilities.

Final dataset size: ~107k instruction samples.
Some additional datasets were created by the author to improve instruction alignment and SFT refinement.
Datasets were cleaned and filtered to remove malformed samples and dataset artifacts.
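The exact cleaning pipeline is not published. A minimal sketch of the kind of filtering described above (dropping malformed or truncated samples) might look like the following; the field names and heuristics are assumptions for illustration:

```python
def is_well_formed(sample: dict) -> bool:
    """Keep only samples with non-empty instruction and response fields.

    Field names ("instruction", "response") and the truncation heuristic
    are assumptions for illustration; the actual pipeline is not published.
    """
    instruction = sample.get("instruction", "").strip()
    response = sample.get("response", "").strip()
    if not instruction or not response:
        return False
    # Drop obvious continuation artifacts: responses cut off mid-sentence.
    if response.endswith(("...", "..")):
        return False
    return True


raw = [
    {"instruction": "What is 2+2?", "response": "4"},
    {"instruction": "", "response": "orphan response"},
    {"instruction": "Summarize X", "response": "X is a topic that.."},
]
cleaned = [s for s in raw if is_well_formed(s)]
```

Here only the first sample survives: the second has an empty instruction, and the third ends in a truncation artifact.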
## Prompt Format
The model expects the following prompt structure:
```
You are a concise AI assistant. Answer the user's question clearly and directly.
User question: {question}
Answer:
```

Example:

```
User question: What is 2+2?
Answer: 4
```
(A `test_model.py` script will be included in the repo for easier testing.)
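A small helper that assembles this prompt structure (the template text mirrors the format above; the function name is just for illustration):

```python
# Template mirroring the prompt structure documented above.
PROMPT_TEMPLATE = (
    "You are a concise AI assistant. Answer the user's question "
    "clearly and directly.\n"
    "User question: {question}\n"
    "Answer:"
)


def build_prompt(question: str) -> str:
    """Format a user question into the model's expected prompt structure."""
    return PROMPT_TEMPLATE.format(question=question.strip())


print(build_prompt("What is 2+2?"))
```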
## Example Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "irfanuruchi/llama3-3b-uruchi-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = """
You are a concise AI assistant.
User question:
Explain machine learning in one sentence.
Answer:
"""

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
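Since the decoded output echoes the prompt, you may want to keep only the text after the final `Answer:` marker. A small post-processing sketch (an assumption about how one might consume the output, not part of the model card):

```python
def extract_answer(decoded: str) -> str:
    """Return only the text after the last 'Answer:' marker, if present."""
    marker = "Answer:"
    if marker in decoded:
        return decoded.rsplit(marker, 1)[1].strip()
    return decoded.strip()


# Example with a mock decoded string (no model call needed):
decoded = "You are a concise AI assistant.\nUser question:\nWhat is 2+2?\nAnswer:\n4"
print(extract_answer(decoded))  # 4
```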
## Intended Behavior
This model is optimized for concise answers, factual responses, reduced hallucination, and simple explanations.
## Limitations
- Small model size (3B parameters)
- Limited deep reasoning capability
- Not optimized for coding tasks
- Context length limitations
## License
This model is a derivative of Llama-3 and follows the Llama 3 Community License.
Base model: meta-llama/Llama-3-3B
Please follow the original license terms when using this model.