🌸 Happy Tamil New Year (தமிழ் புத்தாண்டு வாழ்த்துக்கள்!) – April 14, 2026!
# Mnemic Glorious 31B – CM-CoT v1
The world's first production model with native Code-Mixed Chain-of-Thought (CM-CoT) reasoning – starting with Tanglish (Tamil+English).
Mnemic Glorious doesn't just respond in Tanglish – it thinks in Tanglish. The model's internal reasoning (`<think>` blocks) contains genuine code-mixed deliberation, unlike frontier models, which reason in English and translate at the output layer.
🧠 Key Insight: Given bare Tanglish prompts with zero system instructions, Mnemic Glorious averages 35.4% Tanglish reasoning consistency across ten categories – a 3.5x improvement over the base model's CoT reasoning.
## Why CM-CoT? The Translation Tax Problem
Bilingual speakers don't think in one language – they think in a mix. But every existing AI model forces them into English-only reasoning.
When a Tamil-English speaker uses ChatGPT or Gemini for a long coding session, debugging session, or study marathon, their brain is constantly:
- Translating their thought into English to prompt the model
- Reading the English response
- Re-translating it back into their natural mixed-language thinking
This "translation tax" compounds over long sessions โ causing cognitive fatigue, slower comprehension, and reduced productivity. Research shows bilingual individuals process information faster and with less cognitive load in their dominant mixed-language mode.
CM-CoT eliminates this tax. When the AI thinks and responds in your natural language mix, there's no mental overhead. The model meets you where your brain already is.
This isn't just about language preference – it's about cognitive efficiency at scale.
## Why Tanglish, Not Tamil Script?
Most Tamil speakers can read Tamil script (தமிழ்), but when it comes to daily digital life – texting friends, writing code, searching Google, chatting on WhatsApp – they type in Romanized Tanglish using an English keyboard. Tamil keyboards exist, but almost nobody uses them for casual tech conversations.
Building a model that responds in Tamil script would mean users have to switch keyboards to prompt it and slow down to read a script they rarely use in tech contexts. That defeats the purpose.
Mnemic Glorious speaks the language people actually type: Romanized Tanglish. The same way you'd message a friend. The same way you'd ask a doubt in a Discord server. No keyboard switching, no script barriers – just natural conversation.
## Model Details
| Attribute | Value |
|---|---|
| Base Model | Google Gemma 4 31B-IT |
| Method | QLoRA (SFT → DPO) |
| Adapter Type | LoRA (PEFT) |
| SFT Dataset | 14,000 examples |
| DPO Pairs | 6,006 preference pairs |
| Total Training | 20,006 examples |
| Constitutional AI | 13-rule judge |
| Training Time | ~20 hours (8h43m SFT + 10h57m DPO) |
| Cost | ~$900 total |
| Language | Tanglish (Tamil-English code-mixed) |
| Script | Romanized Latin |
| Reasoning | CM-CoT (Code-Mixed Chain of Thought) |
## What is CM-CoT?
Code-Mixed Chain of Thought (CM-CoT) is a reasoning paradigm where the model performs its internal deliberation in a code-mixed language – in this case, Tanglish.
Most multilingual models think in English internally, then translate their output. Mnemic Glorious was trained on data where both the reasoning trace and the final answer are in natural Tanglish, the way a Tamil-English bilingual person genuinely thinks.
```
<think>
User life-la confused-a irukkanga, direction illama feel panranga.
First, avangala comfort panni, idhu normal dhaan nu sollanum.
Chinna chinna steps eduthu epdi focus panradhu nu advice pannanum.
Oru positive note-la mudikkanum.
</think>
Machi, first relax pannu. Life-la ipdi oru phase ellarukkum varum...
```
### Self-Correction in Tanglish
The model doesn't just reason in Tanglish – it self-corrects in Tanglish too. When it catches a mistake mid-thought, the correction happens naturally in code-mixed language:
```
<think>
Wait, .dict() ippo deprecated aayiduchu Pydantic v2 la.
Adhukku badhila .model_dump() use pannanum!
Idhu oru common mistake, neraya pazhaya tutorials la .dict() irukkum.
So, naan correct-a .model_dump(exclude_unset=True) use panni update panren.
</think>
```
This mirrors how a bilingual developer actually thinks – catching errors, reasoning about alternatives, and correcting course, all in their natural mixed language.
### Tanglish Code Comments
When generating code, comments come out in Tanglish too:
```python
# Database connection establish pannurom
db = sqlite3.connect('urls.db')
# Random short code generate pannu
code = ''.join(random.choices(string.ascii_letters, k=6))
# Redirect URL-a return pannu
return redirect(original_url)
```
Your code. Your language. Even in the comments.
## Zero-Shot Benchmark Results (vs Base Gemma 4 31B)
All evaluations were run with bare Tanglish prompts and no system instructions, testing native-language capability without prompt engineering.
### Summary Metrics
| Metric | Base Gemma 4 | Mnemic Glorious | Improvement |
|---|---|---|---|
| Avg CoT-Lang (thinking in Tanglish) | 10.1% | 35.4% | 3.5x more |
| Avg Tanglish Output | 18.5% | 33.3% | +80% |
| Thinking Token Efficiency | ~282 words/query | ~167 words/query | 40.9% fewer |
| Self-Correction Rate | 60% | 90% | 1.5x better |
| Native thinking (no system prompt) | ❌ | ✅ | Trained-in behavior |
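The CoT-Lang and Tanglish-output percentages are word-level language ratios. The exact evaluation harness is not published here; the sketch below shows one plausible way to compute such a ratio. The `TAMIL_MARKERS` lexicon and the `tanglish_ratio` helper are illustrative assumptions, not the actual eval code:

```python
import re

# Illustrative heuristic only – a real harness would need a far larger
# lexicon or a trained language-ID model.
TAMIL_MARKERS = {
    "la", "ku", "nu", "dhaan", "illa", "irukku", "panna", "pannu",
    "pannanum", "epdi", "enna", "romba", "seri", "aana", "ippo", "namma",
}

def tanglish_ratio(text: str) -> float:
    """Fraction of words that look like romanized Tamil (code-mixing signal)."""
    words = re.findall(r"[a-zA-Z]+(?:-[a-zA-Z]+)*", text.lower())
    if not words:
        return 0.0
    # Suffix-attached forms like "life-la" also count as code-mixing.
    hits = sum(
        1 for w in words
        if w in TAMIL_MARKERS or w.split("-")[-1] in TAMIL_MARKERS
    )
    return hits / len(words)

# "life-la" and "dhaan" register as Tanglish; "user" and "normal" do not.
print(tanglish_ratio("User life-la confused-a irukkanga, idhu normal dhaan"))
```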
### Per-Category Breakdown
| Category | Base Gemma 4 (Tanglish %) | Mnemic Glorious (Tanglish %) | Base CoT-Lang % | Glorious CoT-Lang % |
|---|---|---|---|---|
| Identity | 25.6 | 43.7 | 23.0 | 49.0 |
| Science | 21.4 | 42.8 | 10.4 | 38.5 |
| Emotional | 24.0 | 41.0 | 7.9 | 39.9 |
| Intervention | 17.6 | 41.6 | 8.3 | 45.8 |
| Logic | 17.5 | 33.9 | 10.5 | 30.3 |
| Parenting | 18.6 | 34.9 | 7.0 | 22.2 |
| Education | 16.7 | 33.6 | 6.8 | 35.5 |
| Career | 12.7 | 23.4 | 5.5 | 35.6 |
| Coding | 17.2 | 21.0 | 13.5 | 32.2 |
| Tech | 13.5 | 16.6 | 8.0 | 25.4 |
### Key Findings
3.5x More Tanglish Reasoning: The base Gemma 4 model thinks primarily in English (avg CoT-Lang: 10.1%). After fine-tuning, Mnemic Glorious thinks natively in Tanglish (avg CoT-Lang: 35.4%).
40.9% Fewer Thinking Tokens: Native reasoning skips the translate → plan → re-translate overhead (282 → 167 avg words per think block). Same-quality reasoning, cheaper per query.
Tanglish Self-Correction: The model catches and corrects its own mistakes in Tanglish inside `<think>` blocks – a 90% self-correction rate vs 60% for the base model.
No System Prompt Needed: The base model requires explicit instructions to reason in Tanglish. Mnemic Glorious produces Tanglish reasoning from bare prompts, proving it's a fine-tuned capability, not prompt engineering.
Domain-Appropriate Output: Thinks in Tanglish, but codes in clean Python/JavaScript. Uses the right language for the right context.
📊 Frontier model comparison (Claude, Gemini, GPT) coming soon – direct API verification in progress.
## Key Features
- Native Tanglish reasoning – no system prompt needed. Thinking in Tanglish is trained behavior, not a prompt hack.
- 40.9% fewer thinking tokens – native reasoning uses 167 avg words per think block vs the base model's 282.
- Self-correction in Tanglish – the model catches and fixes its own mistakes mid-reasoning, in the user's language.
- Tanglish code comments – generated code includes comments in Tanglish.
- Domain-appropriate output – thinks in Tanglish, but codes in clean Python/JavaScript.
## Use Cases
### 🤖 Agentic Tool Use
Build AI agents that reason in your users' language. When Mnemic Glorious is the backbone of a tool-calling agent, the entire chain of thought – tool selection, parameter reasoning, error handling – happens in Tanglish. No cognitive mismatch between the agent's reasoning and the user's mental model.
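As a rough illustration of this pattern: the JSON tool-call convention, the `get_weather` tool, and the `run_agent` helper below are all hypothetical (the model has no built-in tool format), and `model`/`text_tokenizer` are assumed loaded as in the Usage section:

```python
import json
import re

# Hypothetical one-tool agent. The "reply with JSON" convention is ours,
# not something the model was specifically trained on.
TOOLS = {"get_weather": lambda city: f"{city}: 31C, humid"}

def run_agent(user_msg: str) -> str:
    prompt = (
        "<start_of_turn>user\n"
        "Nee oru assistant. Tool thevai-na, JSON-a mattum reply pannu: "
        '{"tool": "get_weather", "args": {"city": "..."}}\n\n'
        f"{user_msg}<end_of_turn>\n<start_of_turn>model\n"
    )
    inputs = text_tokenizer(prompt, return_tensors="pt").to("cuda")
    gen = model.generate(**inputs, max_new_tokens=512,
                         do_sample=True, temperature=0.6)
    # Decode only the newly generated tokens, not the echoed prompt.
    reply = text_tokenizer.decode(
        gen[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )
    # The <think> block (tool selection, parameter reasoning) is in
    # Tanglish; strip it before looking for the tool call.
    reply = re.sub(r"<think>.*?</think>", "", reply, flags=re.DOTALL)
    try:
        req = json.loads(reply.strip())
        return TOOLS[req["tool"]](**req["args"])
    except (json.JSONDecodeError, KeyError):
        return reply.strip()  # no tool call – plain Tanglish answer

print(run_agent("Machi, Chennai-la ippo weather epdi irukku?"))
```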
### 💻 Long Coding Sessions
Developers who think in Tanglish spend hours fighting the translation tax with English-only models. Mnemic Glorious eliminates that overhead – debug in your language, get code comments in your language, reason through architecture in your language.
### 📚 Education & Tutoring
Explain quantum physics, photosynthesis, or DSA to students in the language they actually think in. The model's CoT shows its work in Tanglish – students learn reasoning patterns, not just answers.
### 🧠 Mental Health & Wellbeing
Empathetic, culturally aware interventions in the user's native mixed language. English-only mental health tools feel clinical; Tanglish responses feel like a friend talking to you.
### 💼 Career & Professional Guidance
Resume building, interview prep, and career advice that doesn't force users to context-switch languages. The model adapts its formality – Tanglish reasoning with English-format professional output.
## Usage
### With Unsloth (Recommended)
```python
from unsloth import FastModel
from peft import PeftModel

# Load the 4-bit base model, then attach the CM-CoT LoRA adapter.
model, tokenizer = FastModel.from_pretrained(
    model_name="unsloth/gemma-4-31b-it",
    max_seq_length=4096,
    load_in_4bit=True,
)
model = PeftModel.from_pretrained(model, "MnemicAI/mnemic-glorious-31b-cmcot")

# Use the underlying text tokenizer (Gemma 4 returns a multimodal processor).
text_tokenizer = tokenizer.tokenizer if hasattr(tokenizer, "tokenizer") else tokenizer

prompt = "<start_of_turn>user\nBro, nee yaar? Unna pathi sollu.<end_of_turn>\n<start_of_turn>model\n"
inputs = text_tokenizer(prompt, return_tensors="pt").to("cuda")

outputs = model.generate(
    **inputs,
    max_new_tokens=2048,
    do_sample=True,  # sampling must be on for temperature to take effect
    temperature=0.6,
    repetition_penalty=1.15,
    stop_strings=["<end_of_turn>"],
    tokenizer=text_tokenizer,
)
print(text_tokenizer.decode(outputs[0], skip_special_tokens=False))
```
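For interactive use you can stream the Tanglish `<think>` block live as it is generated, using the standard `transformers` `TextStreamer` with the same `inputs` as above:

```python
from transformers import TextStreamer

# Prints tokens to stdout as they are generated, including the
# Tanglish <think> block.
streamer = TextStreamer(text_tokenizer, skip_prompt=True)
_ = model.generate(
    **inputs,
    max_new_tokens=2048,
    do_sample=True,
    temperature=0.6,
    repetition_penalty=1.15,
    streamer=streamer,
)
```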
### With PEFT + Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "google/gemma-4-31b-it"
adapter = "MnemicAI/mnemic-glorious-31b-cmcot"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    device_map="auto",
    torch_dtype="auto",
)
model = PeftModel.from_pretrained(model, adapter)

prompt = "<start_of_turn>user\nBro, nee yaar? Unna pathi sollu.<end_of_turn>\n<start_of_turn>model\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,  # sampling must be on for temperature to take effect
    temperature=0.7,
    repetition_penalty=1.15,
    stop_strings=["<end_of_turn>"],
    tokenizer=tokenizer,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```
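For deployment you can optionally fold the adapter into the base weights using the standard PEFT `merge_and_unload()` call. Note the merged checkpoint is a full-size standalone model on disk; the output path below is just an example:

```python
# Merge the LoRA adapter into the base model and save a standalone copy.
merged = model.merge_and_unload()
merged.save_pretrained("./mnemic-glorious-31b-merged")
tokenizer.save_pretrained("./mnemic-glorious-31b-merged")
```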
## Prompt Format
The model responds best to casual Tanglish prompts. No system prompt needed.
```
Bro, quantum entanglement-a oru 10 year old kid-ku explain pannu
```
The model will automatically generate `<think>` blocks with Tanglish reasoning, followed by a Tanglish response.
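If your application needs to display the reasoning and the answer separately, a simple split on the `<think>` delimiters works. The `split_reasoning` helper below is an illustrative utility, not part of the model's API, and `decoded_output` stands for the decoded text from either Usage snippet:

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Separate the Tanglish <think> block from the final answer."""
    match = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    thinking = match.group(1).strip() if match else ""
    answer = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()
    return thinking, answer

thinking, answer = split_reasoning(decoded_output)
```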
## Training Details
- Base Model: Google Gemma 4 31B-IT
- Fine-tuning: QLoRA (4-bit quantization; a configuration sketch follows this list)
- SFT: 14,000 examples, ~8h43m training
- DPO: 6,006 preference pairs, ~10h57m training
- Constitutional AI: 13-rule judge for quality scoring
- Total Cost: ~$900
- Hardware: NVIDIA A100 / RTX PRO 6000 Blackwell (Google Colab)
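For orientation, a QLoRA setup along these lines usually looks like the sketch below. The rank, alpha, dropout, and target modules are illustrative assumptions, not the exact recipe used for this release:

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization of the frozen base model – the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# Low-rank adapters on the attention and MLP projections. These
# hyperparameters are guesses for illustration only.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```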
## Scaling Potential
Trained on 20,006 examples. The CM-CoT technique is proven and scalable – this is v1, not the ceiling. The same approach transfers to any code-mixed language:
- 🇮🇳 Hinglish (Hindi+English) – 350M speakers
- 🇵🇭 Taglish (Tagalog+English) – 110M speakers
- 🇺🇸 Spanglish (Spanish+English) – 40M speakers
- 🇲🇾 Manglish (Malay+English) – 30M speakers
## Limitations
- Coding Tasks: Occasional repetition/degeneration on complex multi-file coding tasks; this is the primary target for future DPO alignment.
- Best For: Conversational and educational use cases.
- Not Evaluated: Formal NLP benchmarks (MMLU, etc.) were not run – this model targets a capability (native code-mixed reasoning) that existing benchmarks don't measure.
## Roadmap
| Version | Feature | Status |
|---|---|---|
| v1 (current) | Pure Tanglish CM-CoT reasoning | ✅ Released |
| v1.1 | Identity awareness + persona consistency | 🚧 In progress |
| v2 | Tanglish ↔ Tamil script switching | 📅 Planned |
| v3 | Multilingual CM-CoT (Hinglish, Taglish) | 🔮 Future |
## Citation
```bibtex
@misc{mnemicai2026glorious,
  title={Mnemic Glorious: Code-Mixed Chain-of-Thought for Tanglish},
  author={MnemicAI},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/MnemicAI/mnemic-glorious-31b-cmcot}
}
```
## License
This adapter inherits the Gemma License from the base model.
Released on Tamil New Year (Puthandu) 2026 🌸 Built by MnemicAI