🌸 Happy Tamil New Year (தமிழ் புத்தாண்டு வாழ்த்துக்கள்!) – April 14, 2026!
# Mnemic Glorious 31B – CM-CoT v1
The world's first production model with native Code-Mixed Chain-of-Thought (CM-CoT) reasoning – starting with Tanglish (Tamil+English).
Mnemic Glorious doesn't just respond in Tanglish – it thinks in Tanglish. The model's internal reasoning (`<think>` blocks) contains genuine code-mixed deliberation, unlike frontier models, which reason in English and translate at the output layer.
🧠 Key Insight: Given bare Tanglish prompts with zero system instructions, Mnemic Glorious averages 35.4% Tanglish reasoning consistency across ten categories – a 3.5x improvement over the base model's CoT reasoning.
## Why CM-CoT? The Translation Tax Problem
Bilingual speakers don't think in one language – they think in a mix. But every existing AI model forces them into English-only reasoning.
When a Tamil-English speaker uses ChatGPT or Gemini for a long coding session, debugging session, or study marathon, their brain is constantly:
- Translating their thought into English to prompt the model
- Reading the English response
- Re-translating it back into their natural mixed-language thinking
This "translation tax" compounds over long sessions โ causing cognitive fatigue, slower comprehension, and reduced productivity. Research shows bilingual individuals process information faster and with less cognitive load in their dominant mixed-language mode.
CM-CoT eliminates this tax. When the AI thinks and responds in your natural language mix, there's no mental overhead. The model meets you where your brain already is.
This isn't just about language preference – it's about cognitive efficiency at scale.
## Why Tanglish, Not Tamil Script?
Most Tamil speakers can read Tamil script (தமிழ்), but when it comes to daily digital life – texting friends, writing code, searching Google, chatting on WhatsApp – they type in Romanized Tanglish using an English keyboard. Tamil keyboards exist, but almost nobody uses them for casual tech conversations.
Building a model that responds in Tamil script would mean users have to switch keyboards to prompt it and slow down to read a script they rarely use in tech contexts. That defeats the purpose.
Mnemic Glorious speaks the language people actually type: Romanized Tanglish. The same way you'd message a friend. The same way you'd ask a doubt in a Discord server. No keyboard switching, no script barriers – just natural conversation.
## Model Details
| Attribute | Value |
|---|---|
| Base Model | Google Gemma 4 31B-IT |
| Method | QLoRA (SFT → DPO) |
| Adapter Type | LoRA (PEFT) |
| SFT Dataset | 14,000 examples |
| DPO Pairs | 6,006 preference pairs |
| Total Training | 20,006 examples |
| Constitutional AI | 13-rule judge |
| Training Time | ~20 hours (8h43m SFT + 10h57m DPO) |
| Cost | ~$900 total |
| Language | Tanglish (Tamil-English code-mixed) |
| Script | Romanized Latin |
| Reasoning | CM-CoT (Code-Mixed Chain of Thought) |
## What is CM-CoT?
Code-Mixed Chain of Thought (CM-CoT) is a reasoning paradigm where the model performs its internal deliberation in a code-mixed language – in this case, Tanglish.
Most multilingual models think in English internally, then translate their output. Mnemic Glorious was trained on data where both the reasoning trace and the final answer are in natural Tanglish, the way a Tamil-English bilingual person genuinely thinks.
```
<think>
User life-la confused-a irukkanga, direction illama feel panranga.
First, avangala comfort panni, idhu normal dhaan nu sollanum.
Chinna chinna steps eduthu epdi focus panradhu nu advice pannanum.
Oru positive note-la mudikkanum.
</think>
Machi, first relax pannu. Life-la ipdi oru phase ellarukkum varum...
```
### Self-Correction in Tanglish
The model doesn't just reason in Tanglish – it self-corrects in Tanglish too. When it catches a mistake mid-thought, the correction happens naturally in code-mixed language:
```
<think>
Wait, .dict() ippo deprecated aayiduchu Pydantic v2 la.
Adhukku badhila .model_dump() use pannanum!
Idhu oru common mistake, neraya pazhaya tutorials la .dict() irukkum.
So, naan correct-a .model_dump(exclude_unset=True) use panni update panren.
</think>
```
This mirrors how a bilingual developer actually thinks – catching errors, reasoning about alternatives, and correcting course, all in their natural mixed language.
### Tanglish Code Comments
When generating code, comments come out in Tanglish too:
```python
# Database connection establish pannurom
db = sqlite3.connect('urls.db')
# Random short code generate pannu
code = ''.join(random.choices(string.ascii_letters, k=6))
# Redirect URL-a return pannu
return redirect(original_url)
```
Your code. Your language. Even in the comments.
## Zero-Shot Benchmark Results (vs Base Gemma 4 31B)
All evaluations were run with bare Tanglish prompts and no system instructions, testing native-language capability without prompt engineering.
### Summary Metrics
| Metric | Base Gemma 4 | Mnemic Glorious | Improvement |
|---|---|---|---|
| Avg CoT-Lang (thinking in Tanglish) | 10.1% | 35.4% | 3.5x more |
| Avg Tanglish Output | 18.5% | 33.3% | +80% |
| Thinking Token Efficiency | ~282 words/query | ~167 words/query | 40.9% fewer |
| Self-Correction Rate | 60% | 90% | 1.5x better |
| Native thinking (no system prompt) | ❌ | ✅ | Trained-in behavior |
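The CoT-Lang and Tanglish-output percentages are word-level language ratios. The exact evaluation harness is not published here; the sketch below shows one plausible way to compute such a ratio. The `TAMIL_MARKERS` lexicon and the `tanglish_ratio` helper are illustrative assumptions, not the actual eval code:

```python
import re

# Illustrative heuristic only – a real harness would need a far larger
# lexicon or a trained language-ID model.
TAMIL_MARKERS = {
    "la", "ku", "nu", "dhaan", "illa", "irukku", "panna", "pannu",
    "pannanum", "epdi", "enna", "romba", "seri", "aana", "ippo", "namma",
}

def tanglish_ratio(text: str) -> float:
    """Fraction of words that look like romanized Tamil (code-mixing signal)."""
    words = re.findall(r"[a-zA-Z]+(?:-[a-zA-Z]+)*", text.lower())
    if not words:
        return 0.0
    # Suffix-attached forms like "life-la" also count as code-mixing.
    hits = sum(
        1 for w in words
        if w in TAMIL_MARKERS or w.split("-")[-1] in TAMIL_MARKERS
    )
    return hits / len(words)

# "life-la" and "dhaan" register as Tanglish; "user" and "normal" do not.
print(tanglish_ratio("User life-la confused-a irukkanga, idhu normal dhaan"))
```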
### Per-Category Breakdown
| Category | Base Gemma 4 (Tanglish %) | Mnemic Glorious (Tanglish %) | Base CoT-Lang % | Glorious CoT-Lang % |
|---|---|---|---|---|
| Identity | 25.6 | 43.7 | 23.0 | 49.0 |
| Science | 21.4 | 42.8 | 10.4 | 38.5 |
| Emotional | 24.0 | 41.0 | 7.9 | 39.9 |
| Intervention | 17.6 | 41.6 | 8.3 | 45.8 |
| Logic | 17.5 | 33.9 | 10.5 | 30.3 |
| Parenting | 18.6 | 34.9 | 7.0 | 22.2 |
| Education | 16.7 | 33.6 | 6.8 | 35.5 |
| Career | 12.7 | 23.4 | 5.5 | 35.6 |
| Coding | 17.2 | 21.0 | 13.5 | 32.2 |
| Tech | 13.5 | 16.6 | 8.0 | 25.4 |
### Key Findings
3.5x More Tanglish Reasoning: The base Gemma 4 model thinks primarily in English (avg CoT-Lang: 10.1%). After fine-tuning, Mnemic Glorious thinks natively in Tanglish (avg CoT-Lang: 35.4%).
40.9% Fewer Thinking Tokens: Native reasoning skips the translate → plan → re-translate overhead (282 → 167 avg words per think block). Same-quality reasoning, cheaper per query.
Tanglish Self-Correction: The model catches and corrects its own mistakes in Tanglish inside `<think>` blocks – a 90% self-correction rate vs 60% for the base model.
No System Prompt Needed: The base model requires explicit instructions to reason in Tanglish. Mnemic Glorious produces Tanglish reasoning from bare prompts, proving it's a fine-tuned capability, not prompt engineering.
Domain-Appropriate Output: Thinks in Tanglish, but codes in clean Python/JavaScript. Uses the right language for the right context.
📊 Frontier model comparison (Claude, Gemini, GPT) coming soon – direct API verification in progress.
## Key Features
- Native Tanglish reasoning – no system prompt needed. Thinking in Tanglish is trained behavior, not a prompt hack.
- 40.9% fewer thinking tokens – native reasoning uses 167 avg words per think block vs the base model's 282.
- Self-correction in Tanglish – the model catches and fixes its own mistakes mid-reasoning, in the user's language.
- Tanglish code comments – generated code includes comments in Tanglish.
- Domain-appropriate output – thinks in Tanglish, but codes in clean Python/JavaScript.
## Use Cases
### 🤖 Agentic Tool Use
Build AI agents that reason in your users' language. When Mnemic Glorious is the backbone of a tool-calling agent, the entire chain of thought – tool selection, parameter reasoning, error handling – happens in Tanglish. No cognitive mismatch between the agent's reasoning and the user's mental model.
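As a rough illustration of this pattern: the JSON tool-call convention, the `get_weather` tool, and the `run_agent` helper below are all hypothetical (the model has no built-in tool format), and `model`/`text_tokenizer` are assumed loaded as in the Usage section:

```python
import json
import re

# Hypothetical one-tool agent. The "reply with JSON" convention is ours,
# not something the model was specifically trained on.
TOOLS = {"get_weather": lambda city: f"{city}: 31C, humid"}

def run_agent(user_msg: str) -> str:
    prompt = (
        "<start_of_turn>user\n"
        "Nee oru assistant. Tool thevai-na, JSON-a mattum reply pannu: "
        '{"tool": "get_weather", "args": {"city": "..."}}\n\n'
        f"{user_msg}<end_of_turn>\n<start_of_turn>model\n"
    )
    inputs = text_tokenizer(prompt, return_tensors="pt").to("cuda")
    gen = model.generate(**inputs, max_new_tokens=512,
                         do_sample=True, temperature=0.6)
    # Decode only the newly generated tokens, not the echoed prompt.
    reply = text_tokenizer.decode(
        gen[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )
    # The <think> block (tool selection, parameter reasoning) is in
    # Tanglish; strip it before looking for the tool call.
    reply = re.sub(r"<think>.*?</think>", "", reply, flags=re.DOTALL)
    try:
        req = json.loads(reply.strip())
        return TOOLS[req["tool"]](**req["args"])
    except (json.JSONDecodeError, KeyError):
        return reply.strip()  # no tool call – plain Tanglish answer

print(run_agent("Machi, Chennai-la ippo weather epdi irukku?"))
```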
### 💻 Long Coding Sessions
Developers who think in Tanglish spend hours fighting the translation tax with English-only models. Mnemic Glorious eliminates that overhead – debug in your language, get code comments in your language, reason through architecture in your language.
### 📚 Education & Tutoring
Explain quantum physics, photosynthesis, or DSA to students in the language they actually think in. The model's CoT shows its work in Tanglish – students learn reasoning patterns, not just answers.
### 🧠 Mental Health & Wellbeing
Empathetic, culturally aware interventions in the user's native mixed language. English-only mental health tools feel clinical; Tanglish responses feel like a friend talking to you.
### 💼 Career & Professional Guidance
Resume building, interview prep, and career advice that doesn't force users to context-switch languages. The model adapts its formality – Tanglish reasoning with English-format professional output.
## Usage
### With Unsloth (Recommended)
```python
from unsloth import FastModel
from peft import PeftModel

# Load the 4-bit base model, then attach the CM-CoT LoRA adapter.
model, tokenizer = FastModel.from_pretrained(
    model_name="unsloth/gemma-4-31b-it",
    max_seq_length=4096,
    load_in_4bit=True,
)
model = PeftModel.from_pretrained(model, "MnemicAI/mnemic-glorious-31b-cmcot")

# Use the underlying text tokenizer (Gemma 4 returns a multimodal processor).
text_tokenizer = tokenizer.tokenizer if hasattr(tokenizer, "tokenizer") else tokenizer

prompt = "<start_of_turn>user\nBro, nee yaar? Unna pathi sollu.<end_of_turn>\n<start_of_turn>model\n"
inputs = text_tokenizer(prompt, return_tensors="pt").to("cuda")

outputs = model.generate(
    **inputs,
    max_new_tokens=2048,
    do_sample=True,  # sampling must be on for temperature to take effect
    temperature=0.6,
    repetition_penalty=1.15,
    stop_strings=["<end_of_turn>"],
    tokenizer=text_tokenizer,
)
print(text_tokenizer.decode(outputs[0], skip_special_tokens=False))
```
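For interactive use you can stream the Tanglish `<think>` block live as it is generated, using the standard `transformers` `TextStreamer` with the same `inputs` as above:

```python
from transformers import TextStreamer

# Prints tokens to stdout as they are generated, including the
# Tanglish <think> block.
streamer = TextStreamer(text_tokenizer, skip_prompt=True)
_ = model.generate(
    **inputs,
    max_new_tokens=2048,
    do_sample=True,
    temperature=0.6,
    repetition_penalty=1.15,
    streamer=streamer,
)
```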
### With PEFT + Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "google/gemma-4-31b-it"
adapter = "MnemicAI/mnemic-glorious-31b-cmcot"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    device_map="auto",
    torch_dtype="auto",
)
model = PeftModel.from_pretrained(model, adapter)

prompt = "<start_of_turn>user\nBro, nee yaar? Unna pathi sollu.<end_of_turn>\n<start_of_turn>model\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,  # sampling must be on for temperature to take effect
    temperature=0.7,
    repetition_penalty=1.15,
    stop_strings=["<end_of_turn>"],
    tokenizer=tokenizer,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```
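For deployment you can optionally fold the adapter into the base weights using the standard PEFT `merge_and_unload()` call. Note the merged checkpoint is a full-size standalone model on disk; the output path below is just an example:

```python
# Merge the LoRA adapter into the base model and save a standalone copy.
merged = model.merge_and_unload()
merged.save_pretrained("./mnemic-glorious-31b-merged")
tokenizer.save_pretrained("./mnemic-glorious-31b-merged")
```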
## Prompt Format
The model responds best to casual Tanglish prompts. No system prompt needed.
```
Bro, quantum entanglement-a oru 10 year old kid-ku explain pannu
```
The model will automatically generate `<think>` blocks with Tanglish reasoning, followed by a Tanglish response.
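If your application needs to display the reasoning and the answer separately, a simple split on the `<think>` delimiters works. The `split_reasoning` helper below is an illustrative utility, not part of the model's API, and `decoded_output` stands for the decoded text from either Usage snippet:

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Separate the Tanglish <think> block from the final answer."""
    match = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    thinking = match.group(1).strip() if match else ""
    answer = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()
    return thinking, answer

thinking, answer = split_reasoning(decoded_output)
```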
## Training Details
- Base Model: Google Gemma 4 31B-IT
- Fine-tuning: QLoRA (4-bit quantization; a configuration sketch follows this list)
- SFT: 14,000 examples, ~8h43m training
- DPO: 6,006 preference pairs, ~10h57m training
- Constitutional AI: 13-rule judge for quality scoring
- Total Cost: ~$900
- Hardware: NVIDIA A100 / RTX PRO 6000 Blackwell (Google Colab)
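For orientation, a QLoRA setup along these lines usually looks like the sketch below. The rank, alpha, dropout, and target modules are illustrative assumptions, not the exact recipe used for this release:

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization of the frozen base model – the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# Low-rank adapters on the attention and MLP projections. These
# hyperparameters are guesses for illustration only.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
```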
## Scaling Potential
Trained on 20,006 examples. The CM-CoT technique is proven and scalable – this is v1, not the ceiling. The same approach transfers to any code-mixed language:
- 🇮🇳 Hinglish (Hindi+English) – 350M speakers
- 🇵🇭 Taglish (Tagalog+English) – 110M speakers
- 🇺🇸 Spanglish (Spanish+English) – 40M speakers
- 🇲🇾 Manglish (Malay+English) – 30M speakers
## Limitations
- Coding Tasks: Occasional repetition/degeneration on complex multi-file coding tasks; this is the primary target for future DPO alignment.
- Best For: Conversational and educational use cases.
- Not Evaluated: Formal NLP benchmarks (MMLU, etc.) were not run – this model targets a capability (native code-mixed reasoning) that existing benchmarks don't measure.
## Roadmap
| Version | Feature | Status |
|---|---|---|
| v1 (current) | Pure Tanglish CM-CoT reasoning | ✅ Released |
| v1.1 | Identity awareness + persona consistency | 🚧 In progress |
| v2 | Tanglish ↔ Tamil script switching | 📅 Planned |
| v3 | Multilingual CM-CoT (Hinglish, Taglish) | 🔮 Future |
## Citation
```bibtex
@misc{mnemicai2026glorious,
  title={Mnemic Glorious: Code-Mixed Chain-of-Thought for Tanglish},
  author={MnemicAI},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/MnemicAI/mnemic-glorious-31b-cmcot}
}
```
## License
This adapter inherits the Gemma License from the base model.
Released on Tamil New Year (Puthandu) 2026 🌸 Built by MnemicAI