TLF-7B-LLM-01

Model Description

This model is a fine-tuned version of sarvamai/sarvam-1, specialized for bilingual Indic lexicography.

It has been trained to provide structured morphological breakdowns, definitions, and regional translations for Sanskrit and other Indian regional languages.

The training data was ingested through the TLF Mega-Pipeline, which combines structured dictionary databases (MSSQL) with unstructured regional texts to improve the model's handling of grammar and style.

Data Source

The dictionary content is freely available as the Unified Dictionary project on the TransLiteral Foundation's website. The website provides 1,153,927 words and their 2,309,309 meanings from 71 dictionaries, cited with over 1,079 literary sources by several authors from ancient Indian regional and religious texts. The source is used under the Creative Commons ShareAlike International License.

Intended Use

  • Dictionary Lookups: Providing high-accuracy definitions and etymologies.
  • Morphological Analysis: Breaking down complex Sanskrit/Indic root words.
  • Regional Translation: Translating word concepts across Marathi, Hindi, and English.

Training Hyperparameters

The following hyperparameters were used during training:

  • Engine: MLX
  • Learning Rate: 2e-05
  • Batch Size: 1
  • Gradient Accumulation: 64
  • Optimizer: adamw_torch
  • LR Scheduler: cosine
  • LoRA R: 32
  • LoRA Alpha: 16
  • Max Sequence Length: 1024
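
Two derived quantities follow from the values above. The sketch below assumes the common conventions that the effective batch size is the batch size multiplied by the gradient accumulation steps, and that LoRA updates are scaled by alpha / r (as in PEFT's default). The training engine used here may apply its scale differently, so treat the numbers as illustrative.

```python
# Values copied from the hyperparameter list above.
batch_size = 1
grad_accum_steps = 64
lora_r = 32
lora_alpha = 16

# Assumption: one optimizer step is taken after accumulating
# gradients over grad_accum_steps micro-batches.
effective_batch_size = batch_size * grad_accum_steps

# Assumption: the LoRA update is scaled by alpha / r.
lora_scale = lora_alpha / lora_r

print(effective_batch_size)  # 64
print(lora_scale)            # 0.5
```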

Prompt Template

To achieve the intended structured output, use the following prompt format:

<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{query} [/INST]
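
A minimal sketch of filling this template in Python follows. The helper name `build_prompt` and the system prompt text are hypothetical, chosen only for illustration. The `\n` sequences in the template are treated as real newline characters.

```python
def build_prompt(system_prompt: str, query: str) -> str:
    """Fill the prompt template with a system prompt and a user query."""
    # The newline escapes below become real newline characters.
    return f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{query} [/INST]"


example = build_prompt(
    "You are a bilingual Indic lexicography assistant.",  # hypothetical system prompt
    "Define Goddess",
)
print(example)
```

If the tokenizer already prepends a beginning-of-sequence token, the leading `<s>` may end up duplicated. Whether that happens depends on the tokenizer configuration, so it is worth checking the encoded input.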

Inference Example

Using MLX (Apple Silicon)

import mlx_lm
model, tokenizer = mlx_lm.load("AssignArc/TLF-7B-LLM-01")

prompt = "Provide a comprehensive morphological breakdown for: 'Abacus'"
# Wrap the query in the prompt template above (Sarvam/Llama style) before generating
response = mlx_lm.generate(model, tokenizer, prompt=prompt)
print(response)

Using Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, then attach the fine-tuned adapter on top of it
base_model = AutoModelForCausalLM.from_pretrained("sarvamai/sarvam-1")
model = PeftModel.from_pretrained(base_model, "AssignArc/TLF-7B-LLM-01")
tokenizer = AutoTokenizer.from_pretrained("sarvamai/sarvam-1")

# Same query as in the MLX example
prompt = "Provide a comprehensive morphological breakdown for: 'Abacus'"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0]))

Example

Prompt: Define Goddess

2026-03-24 19:10:20,665 - Inference - INFO - 
[BASE MODEL]: <end_of_turn>model <start of turn>: devi is a feminine noun, meaning goddess.
<end_of_turn>model <start of turn>: devi is a feminine noun, meaning goddess.
<end_of_turn>model <start of turn>: devi is a feminine noun, meaning goddess.
<end_of_turn>model <start of turn>: devi is a feminine noun, meaning goddess.
<end_of_turn>model <start of turn>: devi is a feminine noun, meaning goddess.
<end_of_turn>model <start of turn>: devi is a feminine noun, meaning goddess.
<end_of_turn>model <start of turn>: devi is a feminine noun, meaning goddess.
<end_of_turn>model <start of turn>: devi is a feminine noun, meaning goddess.
<end_of_turn>model <start of turn>: devi is a feminine noun, meaning goddess.
<end_of_turn>model <start of turn>: devi is a feminine noun, meaning
2026-03-24 19:10:20,665 - Inference - INFO - [FINETUNED]: "devi" Def: f. ( -वी ) 1 A female deity, goddess; a woman of the first or second order. f( आ ). A female deity, goddess; a woman of the first or second order. Tags: Feminine.<end_of_turn>
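
The fine-tuned output above has a recognizable shape: a quoted headword, a `Def:` field, and a `Tags:` field. The sketch below shows one way such a line could be split into fields. It assumes every response follows the shape of this single example and that multiple tags are comma-separated. Neither assumption is confirmed by the model card, and `parse_entry` is a hypothetical helper.

```python
import re

# Assumption: responses look like
#   "<headword>" Def: <definition> Tags: <tags><end_of_turn>
ENTRY_RE = re.compile(
    r'^"(?P<headword>[^"]+)"\s+Def:\s*(?P<definition>.*?)\s*Tags:\s*(?P<tags>.*)$',
    re.DOTALL,
)


def parse_entry(raw: str):
    """Return a dict of headword, definition, and tags, or None if the shape differs."""
    text = raw.strip()
    # Drop the end-of-turn marker if the decoder left it in.
    if text.endswith("<end_of_turn>"):
        text = text[: -len("<end_of_turn>")].strip()
    match = ENTRY_RE.match(text)
    if match is None:
        return None
    # Assumption: tags are comma-separated and may end with a period.
    tags = [t.strip() for t in match["tags"].rstrip(".").split(",") if t.strip()]
    return {
        "headword": match["headword"],
        "definition": match["definition"],
        "tags": tags,
    }


sample = (
    '"devi" Def: f. ( -वी ) 1 A female deity, goddess; a woman of the first or '
    "second order. f( आ ). A female deity, goddess; a woman of the first or "
    "second order. Tags: Feminine.<end_of_turn>"
)
entry = parse_entry(sample)
print(entry["headword"], entry["tags"])
```

Returning `None` on a mismatch keeps malformed or looping generations, like the base-model output above, from being mistaken for dictionary entries.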

Citation & Credits

  • TLF Framework: Architected for Unified Indic LLM Fine-tuning.
  • Data Source: Custom Dictionary & Regional Text Corpus.
  • MLX-LM: A Python package for generating text and fine-tuning large language models on Apple silicon with MLX.