Model Card: Llamathan-3B

Model Details

Model Name: Llamathan-3B
Base Model: Meta AI – Llama2-3B
Model Type: Causal Language Model (Decoder-only Transformer)
Architecture: LLaMA 2
Fine-Tuning Method: Supervised Fine-Tuning (SFT)
Dataset Size: 3,202 instruction samples
Language: Primarily Tamil (Tanglish + Tamil technical explanations)

Model Description

This model is a fine-tuned version of Llama 2 3B Instruct, optimized for:

Tamil instruction-following
Code-mixed Tamil (Tanglish) explanations
Technical concept explanations in simplified Tamil
Educational Q&A style prompts

The model was trained on a structured instruction dataset in the following format:

{
  "instruction": "Explanation of Mixture of Experts (MoE).",
  "input": "Mixtral models-la 'MoE' na enna logic?",
  "output": "Motha model-aiyum orey nerathula use pannaama, specific question-ku endha 'Expert' (subset of neurons) best-nu router choose pannum. Performance high aagum aana cost kammi."
}

Training Details

The 3,202 training samples cover:

CoT: chain-of-thought reasoning
SQL: query explanation and query generation
Technical terms: explanation and how they work
Multi-step reasoning

Dataset Format

Each sample contains:

instruction → Task definition
input → User query (Tamil / Tanglish / Technical)
output → Expected response
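
As a minimal sketch, a record in this format could be checked before training as shown here. The helper name `validate_sample` and the rule that every field must be a non-empty string are assumptions for illustration; the card does not describe any validation step.

```python
import json

REQUIRED_FIELDS = ("instruction", "input", "output")

def validate_sample(sample: dict) -> list[str]:
    """Return a list of problems found in one record (empty list means OK)."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = sample.get(field)
        if not isinstance(value, str):
            problems.append(f"missing or non-string field: {field}")
        elif not value.strip():
            problems.append(f"empty field: {field}")
    return problems

# The sample record shown in the Model Description section.
record = json.loads("""
{
  "instruction": "Explanation of Mixture of Experts (MoE).",
  "input": "Mixtral models-la 'MoE' na enna logic?",
  "output": "Motha model-aiyum orey nerathula use pannaama, specific question-ku endha 'Expert' (subset of neurons) best-nu router choose pannum. Performance high aagum aana cost kammi."
}
""")

print(validate_sample(record))  # an empty list means all three fields are present
```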

Preprocessing Strategy

Prompt template used during training:

### Instruction:
{instruction}

### Input:
{input}

### Response:
{output}

The model was trained to predict only the Response portion autoregressively.
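
The card does not say how response-only training was implemented. One common approach, sketched here under that assumption, is to mask the prompt tokens in the labels so that only response tokens contribute to the loss. The function names and the toy token IDs are hypothetical; `-100` is the default `ignore_index` of PyTorch's cross-entropy loss.

```python
PROMPT_TEMPLATE = (
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default

def build_prompt(sample: dict) -> str:
    """Fill the training template up to (and including) the Response header."""
    return PROMPT_TEMPLATE.format(
        instruction=sample["instruction"], input=sample["input"]
    )

def build_labels(prompt_ids: list[int], response_ids: list[int]):
    """Concatenate prompt and response IDs; mask the prompt part in the labels."""
    input_ids = prompt_ids + response_ids
    labels = [IGNORE_INDEX] * len(prompt_ids) + response_ids
    return input_ids, labels

# Toy token IDs stand in for real tokenizer output (assumption for illustration).
prompt_ids = [1, 10, 11, 12]
response_ids = [20, 21, 2]
input_ids, labels = build_labels(prompt_ids, response_ids)
print(labels)  # [-100, -100, -100, -100, 20, 21, 2]
```

With labels built this way, the model still sees the full prompt as context, but gradient updates come only from the response tokens.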

Training parameters

Epochs: 3
Batch Size: 8
Learning Rate: 5e-5
Optimizer: AdamW
LR Scheduler: Cosine decay
Max Sequence Length: 2048
Precision: bfloat16 / fp16
Gradient Accumulation: Enabled
Training Hardware: NVIDIA T4 GPU
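
To make the schedule concrete, here is a rough sketch of the step count and cosine-decayed learning rate implied by these values. It assumes one optimizer step per batch of 8 (the number of gradient accumulation steps is not stated), that the final partial batch is kept, and that there is no warmup and the rate decays to zero; none of these details are confirmed by the card.

```python
import math

NUM_SAMPLES = 3202
BATCH_SIZE = 8
EPOCHS = 3
PEAK_LR = 5e-5

# Assumption: one optimizer step per batch of 8, last partial batch kept.
steps_per_epoch = math.ceil(NUM_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

def cosine_lr(step: int, total: int = total_steps, peak: float = PEAK_LR) -> float:
    """Cosine decay from `peak` at step 0 to 0 at the final step (no warmup assumed)."""
    progress = min(step, total) / total
    return 0.5 * peak * (1.0 + math.cos(math.pi * progress))

print(steps_per_epoch, total_steps)  # 401 1203
for step in (0, total_steps // 2, total_steps):
    print(step, cosine_lr(step))
```

If gradient accumulation used more than one step, the number of optimizer updates would shrink by that factor and the schedule would be correspondingly shorter.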

Suitable For

Tamil educational assistants
Technical concept explanation in Tamil
AI/ML explanation chatbot
Code-mixed Tamil conversational agents

Limitations

Small dataset size (3,202 samples)
Possible hallucinations on unseen domains
May mix Tamil and English terminology inconsistently
Limited reasoning depth compared to larger models

Example Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "Hariharan05/Llamathan-3B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)

prompt = """### Instruction:
Explanation of Mixture of Experts (MoE).

### Input:
Mixtral models-la 'MoE' na enna logic?

### Response:
"""

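# Move inputs to whichever device device_map="auto" placed the model on
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
# The decoded text contains the prompt followed by the generated response
print(tokenizer.decode(outputs[0], skip_special_tokens=True))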
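
Because `generate` returns the prompt tokens followed by the continuation, the decoded string includes the prompt. A small helper along these lines could strip it, assuming the prompt ends with the training template's `### Response:` marker (pass a different `marker` otherwise). The name `extract_response` is hypothetical and not part of any library.

```python
RESPONSE_MARKER = "### Response:"

def extract_response(decoded: str, marker: str = RESPONSE_MARKER) -> str:
    """Return the text after the first `marker`, or the whole string if it is absent."""
    _, found, tail = decoded.partition(marker)
    return tail.strip() if found else decoded.strip()

# Stand-in for the decoded model output (assumption for illustration).
decoded = (
    "### Instruction:\nExplanation of Mixture of Experts (MoE).\n\n"
    "### Input:\nMixtral models-la 'MoE' na enna logic?\n\n"
    "### Response:\nSample answer text\n"
)
print(extract_response(decoded))  # Sample answer text
```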

Evaluation

Evaluation was performed using:

Manual qualitative assessment
Instruction-following accuracy
Tamil fluency and coherence checks

GGUF

Model size: 3B params
Architecture: llama
Hardware compatibility: 4-bit
