TLF-7B-LLM-01
Model Description
This model is a fine-tuned version of sarvamai/sarvam-1 specialized for Bilingual Indic Lexicography.
It has been trained to provide structured morphological breakdowns, definitions, and regional translations for Sanskrit and other Indian regional languages.
The training data was ingested through the TLF Mega-Pipeline, which combines structured dictionary databases (MSSQL) with unstructured regional texts to improve the model's handling of grammar and style.
Data Source
The dictionary content is freely available as the Unified Dictionary project on the TransLiteral Foundation's website. The website provides 1,153,927 words and their 2,309,309 meanings from 71 dictionaries. These entries cite over 1,079 literary sources by several authors, drawn from ancient Indian regional and religious texts. The source is used under the Creative Commons ShareAlike International License.
Intended Use
- Dictionary Lookups: Providing high-accuracy definitions and etymologies.
- Morphological Analysis: Breaking down complex Sanskrit/Indic root words.
- Regional Translation: Translating word concepts across Marathi, Hindi, and English.
Training Hyperparameters
The following hyperparameters were used during training:
- Engine: MLX
- Learning Rate: 2e-05
- Batch Size: 1
- Gradient Accumulation: 64
- Optimizer: adamw_torch
- LR Scheduler: cosine
- LoRA R: 32
- LoRA Alpha: 16
- Max Sequence Length: 1024
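The quantities implied by these settings can be worked out with a short sketch. The `TrainConfig` class below is a hypothetical container, not part of the TLF pipeline or of MLX, and the LoRA scale assumes the conventional alpha / r definition; the training engine may define its scale differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the hyperparameters listed above."""

    learning_rate: float = 2e-5
    batch_size: int = 1
    grad_accum_steps: int = 64
    lora_r: int = 32
    lora_alpha: int = 16
    max_seq_len: int = 1024

    @property
    def effective_batch_size(self) -> int:
        # Sequences contributing to each optimizer step.
        return self.batch_size * self.grad_accum_steps

    @property
    def lora_scale(self) -> float:
        # Assumption: conventional LoRA scaling factor alpha / r.
        return self.lora_alpha / self.lora_r

    @property
    def max_tokens_per_update(self) -> int:
        # Upper bound, reached only if every sequence fills the context.
        return self.effective_batch_size * self.max_seq_len


cfg = TrainConfig()
print(cfg.effective_batch_size)   # 64
print(cfg.lora_scale)             # 0.5
print(cfg.max_tokens_per_update)  # 65536
```

With a batch size of 1 and 64 accumulation steps, each optimizer update averages gradients over 64 sequences, which keeps memory use low on a single device.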
Prompt Template
To achieve the intended structured output, use the following prompt format:
<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{query} [/INST]
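As a minimal sketch, the template can be filled in with a small helper. The function name `build_prompt` and the system prompt text are illustrative assumptions, not part of the released model, and the `\n` sequences in the template are assumed to stand for real newlines. Some tokenizers prepend `<s>` on their own, in which case the leading `<s>` may need to be dropped.

```python
def build_prompt(system_prompt: str, query: str) -> str:
    """Fill the documented prompt template with a system prompt and a query."""
    return f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{query} [/INST]"


# Placeholder system prompt; the card does not specify the one used in training.
prompt = build_prompt(
    system_prompt="You are a bilingual Indic lexicographer.",
    query="Define Goddess",
)
print(prompt)
```

The resulting string can be passed as the `prompt` argument in either of the inference paths.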
Inference Example
Using MLX (Apple Silicon)
import mlx_lm

# Load the fine-tuned model and its tokenizer.
model, tokenizer = mlx_lm.load("AssignArc/TLF-7B-LLM-01")
prompt = "Provide a comprehensive morphological breakdown for: 'Abacus'"
# For structured output, wrap `prompt` in the prompt template before generating.
response = mlx_lm.generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(response)
Using Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, then attach the fine-tuned adapter on top of it.
base_model = AutoModelForCausalLM.from_pretrained("sarvamai/sarvam-1")
model = PeftModel.from_pretrained(base_model, "AssignArc/TLF-7B-LLM-01")
tokenizer = AutoTokenizer.from_pretrained("sarvamai/sarvam-1")

prompt = "Provide a comprehensive morphological breakdown for: 'Abacus'"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Example
Prompt: Define Goddess
2026-03-24 19:10:20,665 - Inference - INFO -
[BASE MODEL]: <end_of_turn>model <start of turn>: devi is a feminine noun, meaning goddess.
<end_of_turn>model <start of turn>: devi is a feminine noun, meaning goddess.
<end_of_turn>model <start of turn>: devi is a feminine noun, meaning goddess.
<end_of_turn>model <start of turn>: devi is a feminine noun, meaning goddess.
<end_of_turn>model <start of turn>: devi is a feminine noun, meaning goddess.
<end_of_turn>model <start of turn>: devi is a feminine noun, meaning goddess.
<end_of_turn>model <start of turn>: devi is a feminine noun, meaning goddess.
<end_of_turn>model <start of turn>: devi is a feminine noun, meaning goddess.
<end_of_turn>model <start of turn>: devi is a feminine noun, meaning goddess.
<end_of_turn>model <start of turn>: devi is a feminine noun, meaning
2026-03-24 19:10:20,665 - Inference - INFO - [FINETUNED]: "devi" Def: f. ( -वी ) 1 A female deity, goddess; a woman of the first or second order. f( आ ). A female deity, goddess; a woman of the first or second order. Tags: Feminine.<end_of_turn>
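The fine-tuned output above follows a recognizable shape: a quoted headword, a `Def:` body, and a trailing `Tags:` list. A rough sketch of turning that text into a structured record is shown here. The `parse_entry` helper is hypothetical, and the layout is inferred from this single sample, so other outputs may need different handling.

```python
import re


def parse_entry(raw: str) -> dict:
    """Split one model response into headword, definition, and tags."""
    # Drop the end-of-turn marker seen in the sample output.
    text = raw.replace("<end_of_turn>", "").strip()
    match = re.match(r'"([^"]+)"\s*Def:\s*(.*)', text, flags=re.DOTALL)
    if match is None:
        # Unrecognized layout: return the text unparsed.
        return {"headword": None, "definition": text, "tags": []}
    headword, body = match.groups()
    definition, sep, tag_text = body.rpartition("Tags:")
    if not sep:
        # No "Tags:" section present; the whole body is the definition.
        definition, tag_text = body, ""
    tags = [t.strip(" .") for t in tag_text.split(",") if t.strip(" .")]
    return {"headword": headword, "definition": definition.strip(), "tags": tags}


sample = (
    '"devi" Def: f. ( -वी ) 1 A female deity, goddess; a woman of the first '
    "or second order. f( आ ). A female deity, goddess; a woman of the first "
    "or second order. Tags: Feminine.<end_of_turn>"
)
entry = parse_entry(sample)
print(entry["headword"])  # devi
print(entry["tags"])      # ['Feminine']
```

Responses that do not match the expected shape, such as the repeated base-model output, are returned with `headword` set to `None` so callers can detect them.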
Citation & Credits
- TLF Framework: Architected for Unified Indic LLM Fine-tuning.
- Data Source: Custom Dictionary & Regional Text Corpus.
- MLX-LM: A Python package for generating text and fine-tuning large language models on Apple silicon with MLX.