Llama3-3B Uruchi Instruct

Instruction-tuned version of Llama-3-3B optimized for concise, direct answers.

This model was fine-tuned using LoRA + Unsloth on a mixture of instruction datasets.


Model Overview

Attribute            Value
Model Name           llama3-3b-uruchi-instruct
Base Model           meta-llama/Llama-3-3B
Parameters           3B
Fine-tuning Method   LoRA
Framework            Unsloth
Model Format         safetensors
Author               Irfan Uruchi

Training

The model was fine-tuned using LoRA adapters with the Unsloth training framework for efficient GPU usage.

Training used a cleaned and merged instruction dataset containing approximately 107k samples.

The training configuration used LoRA fine-tuning with instruction-style prompts, and the model is optimized for concise responses through dataset continuation filtering.
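The exact filtering code is not published; as an illustration only, continuation filtering might look like the sketch below, which drops samples whose responses trail off mid-sentence. The `looks_truncated` helper and its heuristics are assumptions, not the actual pipeline:

```python
def looks_truncated(response: str) -> bool:
    """Heuristic: flag responses that trail off instead of ending cleanly."""
    text = response.rstrip()
    if not text:
        return True
    # Responses ending mid-sentence or with an ellipsis are likely truncations.
    return text.endswith("...") or text[-1] not in ".!?\"')"

def filter_continuations(samples):
    """Keep only samples whose responses end at a natural stopping point."""
    return [s for s in samples if not looks_truncated(s["response"])]

samples = [
    {"instruction": "What is 2+2?", "response": "4."},
    {"instruction": "Explain gravity.", "response": "Gravity is the force that"},
]
print(len(filter_continuations(samples)))  # only the complete answer survives
```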


Training Datasets

The training dataset consists of a mixture of open instruction datasets:

UltraChat

High-quality conversational instruction dataset.

OpenOrca

Reasoning and explanation dataset derived from GPT-style instruction generation.

GSM8K

Math reasoning dataset used to improve logical reasoning capabilities.

Final dataset size: ~107k instruction samples

Some additional datasets were created by me to improve instruction alignment and SFT refinement.

Datasets were cleaned and filtered to remove malformed samples and dataset artifacts.
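The cleaning step is likewise unpublished; a minimal sketch of what malformed-sample filtering could look like, where the field names and artifact list are assumptions rather than the real pipeline:

```python
def is_well_formed(sample: dict) -> bool:
    """Reject samples with missing fields, empty text, or leftover artifacts."""
    instruction = sample.get("instruction", "").strip()
    response = sample.get("response", "").strip()
    if not instruction or not response:
        return False
    # Example dataset artifacts: stray special tokens and template fragments.
    artifacts = ("<|endoftext|>", "[INST]", "### Response:")
    return not any(tag in response for tag in artifacts)

raw_samples = [
    {"instruction": "Define entropy.", "response": "A measure of disorder."},
    {"instruction": "Broken sample", "response": "<|endoftext|>"},
    {"instruction": "", "response": "No instruction."},
]
cleaned = [s for s in raw_samples if is_well_formed(s)]
print(len(cleaned))  # only the first sample survives
```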


Prompt Format

The model expects the following prompt structure:

You are a concise AI assistant. Answer the user's question clearly and directly.

User question: {question}

Answer:

Example:

User question: What is 2+2?

Answer: 4

(A test_model.py script will be included in the repo for easier testing.)
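The template above can be assembled with a small helper; the function name here is illustrative, not part of the repo:

```python
SYSTEM = "You are a concise AI assistant. Answer the user's question clearly and directly."

def build_prompt(question: str) -> str:
    """Fill the model's expected prompt template with a user question."""
    return f"{SYSTEM}\n\nUser question: {question}\n\nAnswer:"

print(build_prompt("What is 2+2?"))
```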


Example Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "irfanuruchi/llama3-3b-uruchi-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = """
You are a concise AI assistant.

User question:
Explain machine learning in one sentence.

Answer:
"""

inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=50,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
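Because generate returns the prompt followed by the completion, you may want only the text after the final "Answer:" marker. One way to do that (a convenience sketch, not part of the model card):

```python
def extract_answer(decoded: str) -> str:
    """Return only the text generated after the last 'Answer:' marker."""
    marker = "Answer:"
    if marker not in decoded:
        return decoded.strip()
    return decoded.rsplit(marker, 1)[-1].strip()

print(extract_answer("User question: What is 2+2?\n\nAnswer: 4"))  # -> 4
```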

Intended Behavior

This model is optimized for concise answers, factual responses, reduced hallucination, and simple explanations.


Limitations

  • Small model size (3B parameters)
  • Limited deep reasoning capability
  • Not optimized for coding tasks
  • Context length limitations

License

This model is a derivative of Llama-3 and follows the Llama 3 Community License.

Base model: meta-llama/Llama-3-3B

Please follow the original license terms when using this model.
