Introducing Tamil-R1: First Reasoning Model That THINKS in Tamil
Tamil-R1: First AI That Actually THINKS in Tamil (Yeah, verified with Google)
Hi everyone! 👋
I'm excited to share Tamil-R1 - the first reasoning model that performs chain-of-thought reasoning natively in Tamil.
The Problem I Solved
I tested ChatGPT o1, DeepSeek R1, Sarvam-M, Claude, and Gemini with Tamil questions. Every single one showed its reasoning in English and only then translated the answer into Tamil.
Even Sarvam-M (built in India with $41M funding) thinks in English when you click "Show Thinking."
That's not Tamil AI. That's English AI in Tamil clothing.
What Makes Tamil-R1 Different
Other models:
Tamil question → English thinking → Tamil answer
Tamil-R1:
Tamil question → Tamil thinking → Tamil answer
Example:
Question:
காய்கறி வாங்கணும். பக்கத்து கடை ₹45, தூர கடை ₹35. எது நல்லது?
(Need to buy vegetables. Nearby shop ₹45, distant shop ₹35. Which is better?)
Tamil-R1 Thinking:
<think>
இரண்டு வழி பார்ப்போம்:
பக்கத்து கடை:
- விலை = ₹45
- நேரம் = 5 நிமிடம்
- வசதியானது
தூர கடை:
- விலை = ₹35
- நேரம் = 30 நிமிடம்
- கஷ்டம்
நேர மதிப்பு கணக்கு:
- ₹10 சேமிப்புக்கு 25 நிமிடம் கூடுதல்
- மணிக்கு ₹24 மதிப்பு மட்டும்
- உன் நேரம் இதுக்கு அதிக மதிப்பு
</think>
Answer:
பக்கத்து கடையே போ. ₹10 சேமிப்புக்கு 25 நிமிடம் பயணம் மதிப்பு இல்ல.
நேரம் = பணத்தை விட முக்கியம்.
See the difference? Pure Tamil reasoning. No translation artifacts. No English thinking.
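If you want to double-check the time-value arithmetic in the example above, here is a minimal Python sketch. The prices and travel times come from the example; the function name `savings_per_hour` is made up for illustration.

```python
def savings_per_hour(saving_rupees: float, extra_minutes: float) -> float:
    """Convert 'rupees saved for N extra minutes' into rupees per hour."""
    return saving_rupees * 60 / extra_minutes


# Numbers from the vegetable-shop example
near_price, far_price = 45, 35        # rupees
near_minutes, far_minutes = 5, 30     # travel time

saving = near_price - far_price             # 10 rupees saved at the far shop
extra_time = far_minutes - near_minutes     # 25 extra minutes of travel

rate = savings_per_hour(saving, extra_time)
print(f"Saving {saving} rupees costs {extra_time} extra minutes")
print(f"That values your time at {rate} rupees per hour")  # 24.0
```

So the ₹24 per hour figure in the Tamil reasoning checks out: ₹10 saved over 25 extra minutes.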
How I Built It
- Time: 2 weeks
- Cost: ₹0 (using Kaggle's free GPU tier)
- Dataset: 100 hand-crafted Tamil reasoning examples
- Method: Fine-tuned DeepSeek-R1-Distill-Llama-8B using LoRA
- Training: 30 minutes on Tesla T4 x2
- License: MIT (100% open source)
Technical Details
- Base Model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
- Fine-tuning Method: LoRA (rank 16, alpha 16)
- Training Data: 100 Tamil examples across 5 categories:
  - Shopping & Money decisions
  - Home & Cooking planning
  - Daily decisions
  - Health & Routine
  - Common sense reasoning
- Training Loss: 2.73 → 2.07 (3 epochs)
- Format: User question → <think>Tamil reasoning</think> → Answer
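To give a feel for what "rank 16, alpha 16" means in practice, here is a rough back-of-the-envelope sketch. It uses the standard LoRA facts that an adapter on a `d_out x d_in` weight adds `rank * (d_in + d_out)` trainable parameters and that the update is scaled by `alpha / rank`. Everything else is an assumption, not something stated in this post: adapters only on `q_proj` and `v_proj`, and the usual Llama 8B shapes (hidden size 4096, key/value projection width 1024, 32 layers). The actual target modules used for Tamil-R1 may differ, so treat the total as illustrative.

```python
def lora_added_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one weight matrix.

    LoRA learns A (rank x d_in) and B (d_out x rank), so the count is
    rank * d_in + d_out * rank.
    """
    return rank * (d_in + d_out)


def lora_scaling(alpha: float, rank: int) -> float:
    """Standard LoRA scaling factor applied to the low-rank update B @ A."""
    return alpha / rank


RANK, ALPHA = 16, 16          # from the post

# ASSUMPTIONS (not from the post): Llama 8B shapes and target modules
HIDDEN = 4096                 # hidden size
KV_DIM = 1024                 # key/value projection width (grouped-query attention)
NUM_LAYERS = 32
ASSUMED_TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, KV_DIM),
}

per_layer = sum(
    lora_added_params(d_in, d_out, RANK)
    for d_in, d_out in ASSUMED_TARGETS.values()
)
total = per_layer * NUM_LAYERS

print(f"scaling = {lora_scaling(ALPHA, RANK)}")          # 1.0
print(f"adapter params per layer = {per_layer:,}")       # 212,992
print(f"adapter params total     = {total:,}")           # 6,815,744
```

Under these assumptions the adapter trains roughly 6.8 million parameters, a tiny fraction of the 8B base model, which is why a run like this can fit on free T4 GPUs.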
Limitations (v0.1)
⚠️ This is a proof-of-concept with only 100 training examples:
- May generate repetitive or incomplete outputs
- Limited to everyday scenarios (not complex academic problems)
- Needs 5-10x more training data for production use
- No standard benchmark evaluation yet
This is v0.1 - rough around the edges, but it WORKS.
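One rough way to catch the repetitive or incomplete outputs mentioned above is to check the `<think>...</think>` structure before showing a response. The sketch below is only an illustration, not part of Tamil-R1: the helper names (`split_output`, `repeated_line_ratio`, `looks_broken`) and the 0.5 repetition threshold are assumptions chosen for the example.

```python
import re

# Matches a closed <think> block followed by whatever comes after it.
THINK_RE = re.compile(r"<think>(.*?)</think>(.*)", re.DOTALL)


def split_output(text: str):
    """Return (reasoning, answer), or None if the <think> block is missing or unclosed."""
    match = THINK_RE.search(text)
    if match is None:
        return None
    return match.group(1).strip(), match.group(2).strip()


def repeated_line_ratio(text: str) -> float:
    """Fraction of non-empty lines that duplicate an earlier line."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    return 1 - len(set(lines)) / len(lines)


def looks_broken(text: str, max_repeat: float = 0.5) -> bool:
    """Flag outputs with no closed <think> block, no answer, or heavy repetition."""
    parts = split_output(text)
    if parts is None or not parts[1]:
        return True
    reasoning, _answer = parts
    return repeated_line_ratio(reasoning) > max_repeat


good = "<think>\nவிலை = ₹45\nவிலை = ₹35\n</think>\nபக்கத்து கடையே போ."
truncated = "<think>\nவிலை = ₹45"
looping = "<think>\nவிலை = ₹45\nவிலை = ₹45\nவிலை = ₹45\nவிலை = ₹45\n</think>\nபோ."

for name, sample in [("good", good), ("truncated", truncated), ("looping", looping)]:
    print(name, looks_broken(sample))
```

A check like this only catches structural problems; it says nothing about whether the Tamil reasoning itself is correct.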
Why This Matters
70 million Tamil speakers deserve AI that:
- ✅ Thinks in their language
- ✅ Understands cultural context
- ✅ Reasons naturally without translation artifacts
- ✅ Shows transparent thinking process
Language shapes thought. When AI thinks in English and translates, it loses:
- Cultural nuance
- Natural reasoning patterns
- Linguistic authenticity
Try It Yourself
Note: This is my first model.