Instructions for using prithivMLmods/Qwen2.5-3B-Tamil-Exp with libraries and local apps.

Libraries

How to use prithivMLmods/Qwen2.5-3B-Tamil-Exp with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="prithivMLmods/Qwen2.5-3B-Tamil-Exp")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("prithivMLmods/Qwen2.5-3B-Tamil-Exp")
model = AutoModelForCausalLM.from_pretrained("prithivMLmods/Qwen2.5-3B-Tamil-Exp", device_map="auto")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use prithivMLmods/Qwen2.5-3B-Tamil-Exp with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "prithivMLmods/Qwen2.5-3B-Tamil-Exp"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "prithivMLmods/Qwen2.5-3B-Tamil-Exp",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
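
The same request can also be sent from Python. The following is a minimal sketch that uses only the standard library and assumes the vLLM server started above is listening on localhost:8000. The helper names `build_chat_request` and `send_chat_request` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "prithivMLmods/Qwen2.5-3B-Tamil-Exp"


def build_chat_request(prompt, model=MODEL_ID):
    """Build the JSON body for the OpenAI-compatible /v1/chat/completions endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def send_chat_request(payload, base_url="http://localhost:8000"):
    """POST the payload to a running server and return the text of the first reply."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    payload = build_chat_request("What is the capital of France?")
    print(json.dumps(payload, indent=2))
    # Uncomment once the server is running:
    # print(send_chat_request(payload))
```

For a server listening on a different port, such as the SGLang examples on port 30000, pass `base_url="http://localhost:30000"`.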

Use Docker

docker model run hf.co/prithivMLmods/Qwen2.5-3B-Tamil-Exp

SGLang

How to use prithivMLmods/Qwen2.5-3B-Tamil-Exp with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "prithivMLmods/Qwen2.5-3B-Tamil-Exp" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "prithivMLmods/Qwen2.5-3B-Tamil-Exp",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "prithivMLmods/Qwen2.5-3B-Tamil-Exp" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using the same curl request as in the pip-based SGLang setup above.

Docker Model Runner
How to use prithivMLmods/Qwen2.5-3B-Tamil-Exp with Docker Model Runner:
```
docker model run hf.co/prithivMLmods/Qwen2.5-3B-Tamil-Exp
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

Qwen2.5-3B-Tamil-Exp

Qwen2.5-3B-Tamil-Exp is built on the Qwen2.5 architecture and adapted for Tamil language tasks. This 3B-parameter variant was fine-tuned on entries from the prithivMLmods/Deepthink-Reasoning-Tamil dataset, building on the reasoning capabilities of the Qwen models, to improve chain-of-thought reasoning and logical problem solving in Tamil. The improvements extend to context understanding, structured data processing, and long-context comprehension, which suits the model to complex reasoning tasks, instruction following, and text generation in Tamil and other languages.

Key Improvements

Advanced Reasoning & Logic:
Optimized for multi-step problem solving and logical deduction. Fine-tuning on the Deepthink-Reasoning-Tamil entries further refines its reasoning capabilities in Tamil contexts.
Fine-Tuned Instruction Following:
Generates precise responses and structured outputs (such as JSON), making it well-suited for dialog-based applications and code generation tasks that require strict adherence to Tamil language instructions.
Greater Adaptability:
Excels in role-playing scenarios, multi-turn dialogues, and diverse system prompts with a focus on culturally nuanced Tamil content while maintaining support for multiple languages.
Long-Context Support:
Capable of handling extended inputs (up to 64K tokens) and generating outputs of up to 4K tokens, enabling the processing of detailed and lengthy Tamil texts.
Multilingual Proficiency with Tamil Focus:
While supporting over 20 languages, the model’s training emphasis on Tamil ensures superior performance on tasks involving Tamil language understanding and generation.
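
The long-context figures above suggest a simple budget check before calling `generate`. The sketch below takes the 64K-token input and 4K-token output limits from this card at face value. Reading them as 65,536 and 4,096 tokens is an assumption, and the function name `plan_generation` is hypothetical.

```python
MAX_INPUT_TOKENS = 64 * 1024   # assumption: "64K" read as 65,536 tokens
MAX_OUTPUT_TOKENS = 4 * 1024   # assumption: "4K" read as 4,096 tokens


def plan_generation(prompt_tokens, requested_new_tokens):
    """Return the max_new_tokens value to use, or raise if the prompt is too long."""
    if prompt_tokens < 0 or requested_new_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if prompt_tokens > MAX_INPUT_TOKENS:
        raise ValueError(
            f"prompt has {prompt_tokens} tokens; limit is {MAX_INPUT_TOKENS}"
        )
    # Clamp the request to the stated output ceiling.
    return min(requested_new_tokens, MAX_OUTPUT_TOKENS)


if __name__ == "__main__":
    print(plan_generation(prompt_tokens=1_200, requested_new_tokens=256))     # 256
    print(plan_generation(prompt_tokens=60_000, requested_new_tokens=8_000))  # 4096
```

The prompt length can be measured with the model's tokenizer, for example `len(tokenizer(text)["input_ids"])`. The check is only as accurate as the limits stated on this card.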

Intended Use

Advanced Logical & Analytical Reasoning:
Ideal for solving multi-step problems and deductive reasoning tasks, especially those presented in Tamil.
Mathematical & Scientific Computation:
Supports theorem proving, complex calculations, and retrieval of scientific knowledge with an emphasis on Tamil terminology.
Code Generation & Debugging:
Generates optimized code, detects errors, and enhances programming workflows with support for Tamil documentation or comments.
Structured Data Analysis:
Processes tables, JSON, and other structured formats, which is particularly useful for localized applications requiring Tamil language outputs.
Multilingual Reasoning & Translation:
While excelling in Tamil, it is also proficient in other languages for international applications.
Extended Text Generation:
Capable of producing research papers, instructional guides, and in-depth reports in Tamil.
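
When the model is asked for structured output such as JSON, the reply may still surround the object with extra text. The following is a minimal sketch of a tolerant parser built on the standard library. `extract_json` is a hypothetical helper, and the sample reply is invented for illustration rather than real model output.

```python
import json


def extract_json(reply):
    """Return the first JSON object or array found in a model reply, or None."""
    decoder = json.JSONDecoder()
    for start, char in enumerate(reply):
        if char not in "{[":
            continue
        try:
            # raw_decode parses one JSON value beginning at `start`
            # and ignores any text that follows it.
            value, _end = decoder.raw_decode(reply, start)
        except json.JSONDecodeError:
            continue
        return value
    return None


if __name__ == "__main__":
    sample = 'Here is the result: {"language": "Tamil", "steps": 3} Hope this helps.'
    print(extract_json(sample))
```

Returning `None` instead of raising lets the caller decide whether to retry the prompt or fall back to plain-text handling.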

Quickstart with Transformers

Below is an example of how to load and use the model with the Hugging Face Transformers library:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Qwen2.5-3B-Tamil-Exp"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "தமிழில் தர்க்கரீதியான எண்ணத்தை விளக்குங்கள்."  # "Explain the concept of logical reasoning in Tamil."
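# The system prompt below reads roughly: "You are an excellent logical-reasoning assistant in Tamil."
messages = [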
    {"role": "system", "content": "நீங்கள் ஒரு தமிழில் சிறந்த தர்க்கரீதியான உதவியாளர்."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=256
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)

Limitations

Moderate Computational Requirements:
Runs best on a mid-range consumer GPU or better.
Language-Specific Variability:
While performance is strong for Tamil, results may vary for other supported languages.
Potential Error Accumulation:
Extended outputs may sometimes introduce inconsistencies.
Limited Real-World Awareness:
The model’s knowledge is based on its training data and may not include recent events.
Prompt Sensitivity:
High-quality responses depend on the clarity and specificity of the input prompt.
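
To make the hardware requirement more concrete, the weight footprint can be estimated from the parameter count and tensor type reported for this repository (3B params, F16). The sketch below is a back-of-the-envelope calculation only. It treats 3B as exactly 3e9 parameters and ignores the KV cache, activations, and framework overhead, all of which add to real memory use. The function name `weight_memory_gib` is hypothetical.

```python
# Bytes needed to store one parameter in each tensor type.
BYTES_PER_PARAM = {"F32": 4, "F16": 2, "BF16": 2, "INT8": 1}


def weight_memory_gib(num_params, dtype="F16"):
    """Estimate the memory needed to hold the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


if __name__ == "__main__":
    for dtype in ("F32", "F16", "INT8"):
        print(f"{dtype}: {weight_memory_gib(3e9, dtype):.1f} GiB")
```

Under these assumptions the F16 weights come to about 5.6 GiB, so a GPU needs more than that to leave room for the cache and activations.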

Downloads last month: 17

Safetensors
Model size: 3B params
Tensor type: F16

Model tree for prithivMLmods/Qwen2.5-3B-Tamil-Exp
Base model: Qwen/Qwen2.5-3B
Finetuned: Qwen/Qwen2.5-3B-Instruct
Finetuned (1455): this model
Quantizations: 2 models
