# 🅰️ Akshara-ML — Malayalam Transliteration Model
**Akshara-ML** is a neural transliteration model that converts **Manglish (Romanized Malayalam)** into **Malayalam script**.
Developed by **EnduraSolution**, in association with **Aksharakuppy**.
🌐 https://aksharakuppy.com
---
[![Hugging Face](https://img.shields.io/badge/HuggingFace-Model-yellow)](https://huggingface.co/endurasolution/akshara-ml)
## ✨ Features
- 🔤 Manglish → Malayalam transliteration
- ⚡ Fast inference with greedy decoding
- 🎯 Higher accuracy with beam search decoding, at the cost of slower inference
- 🧠 Transformer-based architecture
- 🇮🇳 Built specifically for the Malayalam language
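
The trade-off between the two decoding modes listed above can be illustrated with a small, self-contained sketch. It does not use Akshara-ML itself: `TOY_PROBS`, `next_log_probs`, `greedy_decode`, and `beam_search_decode` are hypothetical names over a hand-written probability table, chosen so that the locally best first token leads to a worse overall sequence. The model's own decoding code may differ, for example by applying length normalization.

```python
import math

SOS, EOS = "<s>", "</s>"

# Hypothetical toy "model": maps a prefix (tuple of tokens) to
# next-token probabilities. Not related to the real Akshara-ML weights.
TOY_PROBS = {
    (SOS,): {"a": 0.6, "b": 0.4},
    (SOS, "a"): {"x": 0.55, "y": 0.45},
    (SOS, "b"): {"z": 0.9, "x": 0.1},
}

def next_log_probs(prefix):
    """Return {token: log-probability}; unknown prefixes end the sequence."""
    probs = TOY_PROBS.get(tuple(prefix), {EOS: 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}

def greedy_decode(max_len=10):
    """Pick the single most probable token at every step."""
    seq, score = [SOS], 0.0
    for _ in range(max_len):
        log_probs = next_log_probs(seq)
        tok = max(log_probs, key=log_probs.get)
        seq.append(tok)
        score += log_probs[tok]
        if tok == EOS:
            break
    return seq, score

def beam_search_decode(beam_width=2, max_len=10):
    """Keep the beam_width best partial sequences at every step."""
    beams = [([SOS], 0.0)]
    finished = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates = []
        for seq, score in beams:
            for tok, lp in next_log_probs(seq).items():
                candidates.append((seq + [tok], score + lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        # Keep the top hypotheses; those ending in EOS are complete.
        beams = []
        for seq, score in candidates[:beam_width]:
            if seq[-1] == EOS:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off by max_len
    return max(finished, key=lambda c: c[1])

greedy_seq, greedy_score = greedy_decode()
beam_seq, beam_score = beam_search_decode(beam_width=2)
print("greedy:", greedy_seq, round(math.exp(greedy_score), 2))
print("beam:  ", beam_seq, round(math.exp(beam_score), 2))
```

In this toy table, greedy decoding commits to `a` because it is the most probable first token, while a beam of width 2 keeps `b` alive long enough to find the more probable full sequence. With `beam_width=1` the beam search reduces to greedy decoding.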
---
## 🧪 Example
| Manglish | Malayalam |
|--------|----------|
| namaskaram | നമസ്കാരം |
| sugam aano | സുഖം ആണോ |
| ente peru | എന്റെ പേര് |
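
The `transliterate` function in the Usage section feeds the model one character at a time, wrapped in start and end markers, with unknown characters mapped to a fallback index. A minimal sketch of that encoding step on the example inputs above follows; the special-token indices, `build_char_vocab`, and `encode` are assumptions made for illustration and need not match `Config` or `src_vocab.json`.

```python
# Hypothetical special-token indices; the real values live in Config.
PAD_IDX, SOS_IDX, EOS_IDX, UNK_IDX = 0, 1, 2, 3

def build_char_vocab(texts):
    """Assign an id to every distinct character seen in texts."""
    vocab = {"<pad>": PAD_IDX, "<sos>": SOS_IDX, "<eos>": EOS_IDX, "<unk>": UNK_IDX}
    for ch in sorted(set("".join(texts))):
        vocab[ch] = len(vocab)
    return vocab

def encode(text, vocab):
    """Character-level encoding: SOS + one id per character + EOS."""
    return [SOS_IDX] + [vocab.get(ch, UNK_IDX) for ch in text] + [EOS_IDX]

toy_vocab = build_char_vocab(["namaskaram", "sugam aano", "ente peru"])
print(encode("ente", toy_vocab))  # four character ids between SOS and EOS
print(encode("q", toy_vocab))     # 'q' is not in the toy vocab, so it maps to UNK
```

Because the vocabulary is character-level, any input spelled with known characters can be encoded, including words never seen during training.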
---
## 🚀 Usage (Python)
```python
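import torch

# Project modules: model.py, train.py, dataset.py and config.py
# are expected to be on the import path.
from config import Config
from dataset import load_vocab, get_inverse_vocab
from model import build_model
from train import load_checkpoint

# Load the source (Manglish) and target (Malayalam) vocabularies
src_vocab = load_vocab("src_vocab.json")
tgt_vocab = load_vocab("tgt_vocab.json")
inv_vocab = get_inverse_vocab(tgt_vocab)

# Build the model, load the trained weights, and switch to inference mode
model = build_model(len(src_vocab), len(tgt_vocab))
load_checkpoint("pytorch_model.bin", model)
model.eval()

def transliterate(text):
    # Encode character by character, wrapped in start/end-of-sequence markers
    ids = [Config.SOS_IDX] + [src_vocab.get(c, Config.UNK_IDX) for c in text] + [Config.EOS_IDX]
    src = torch.tensor([ids])  # shape: (1, sequence_length)
    with torch.no_grad():  # inference only, so skip gradient tracking
        pred_ids = model.greedy_decode(src)
    # Map predicted ids back to Malayalam characters, stopping at end-of-sequence
    output = ""
    for i in pred_ids:
        if i == Config.EOS_IDX:
            break
        output += inv_vocab.get(i, "")
    return output

print(transliterate("namaskaram"))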