# 🅰️ Akshara-ML — Malayalam Transliteration Model

**Akshara-ML** is a neural transliteration model that converts **Manglish (Romanized Malayalam)** into **Malayalam script**.

Developed by **EnduraSolution**, in association with **Aksharakuppy**.

🌐 https://aksharakuppy.com

---

[![Hugging Face](https://img.shields.io/badge/HuggingFace-Model-yellow)](https://huggingface.co/endurasolution/akshara-ml)

## ✨ Features

- 🔤 Manglish → Malayalam transliteration
- ⚡ Fast inference (greedy decoding)
- 🎯 High accuracy (beam search decoding)
- 🧠 Transformer-based architecture
- 🇮🇳 Built specifically for the Malayalam language
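
To illustrate the trade-off between the two decoding modes listed above, here is a minimal, framework-free sketch. It is not the model's own decoder: `toy_step`, its probability table, and the `EOS` marker are hypothetical stand-ins for a real network, chosen so that the locally best first token leads to a worse overall sequence.

```python
import math

EOS = "</s>"  # hypothetical end-of-sequence marker for this toy example


def toy_step(prefix):
    """Hypothetical next-token log-probabilities; stands in for a real model."""
    table = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"x": 0.5, "y": 0.5},
        ("b",): {"z": 0.9, "x": 0.1},
    }
    # Any prefix not in the table ends the sequence with certainty.
    probs = table.get(tuple(prefix), {EOS: 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}


def greedy_decode(step_fn, max_len=10):
    """Pick the single most likely token at every step."""
    seq, score = [], 0.0
    for _ in range(max_len):
        log_probs = step_fn(seq)
        tok = max(log_probs, key=log_probs.get)
        score += log_probs[tok]
        if tok == EOS:
            break
        seq.append(tok)
    return seq, score


def beam_search_decode(step_fn, beam_size=2, max_len=10):
    """Keep the beam_size best partial sequences at every step."""
    beams = [([], 0.0)]  # unfinished hypotheses as (tokens, log-probability)
    finished = []
    for _ in range(max_len):
        if not beams:
            break
        candidates = []
        for seq, score in beams:
            for tok, lp in step_fn(seq).items():
                if tok == EOS:
                    finished.append((seq, score + lp))
                else:
                    candidates.append((seq + [tok], score + lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    finished.extend(beams)  # hypotheses cut off by max_len
    return max(finished, key=lambda c: c[1])


print(greedy_decode(toy_step)[0])       # ['a', 'x'] (probability 0.30)
print(beam_search_decode(toy_step)[0])  # ['b', 'z'] (probability 0.36)
```

Greedy decoding commits to `a` because it is the best first token, while a beam of width 2 also keeps `b` alive and finds the more probable sequence. The cost is roughly `beam_size` times more model calls per step, which is why greedy decoding is the faster option.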

---

## 🧪 Example

| Manglish | Malayalam |
|--------|----------|
| namaskaram | നമസ്കാരം |
| sugam aano | സുഖം ആണോ |
| ente peru | എന്റെ പേര് |
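
Both columns of the table are plain character sequences, and the usage snippet in the next section encodes its input one character at a time. The sketch below shows what such a character-level round trip might look like. `build_vocab`, `encode`, `decode`, and the special-token IDs are hypothetical stand-ins for illustration, not the project's actual `dataset` or `Config` code.

```python
# Hypothetical special-token IDs; the real values live in the project's Config.
PAD_IDX, SOS_IDX, EOS_IDX, UNK_IDX = 0, 1, 2, 3
NUM_SPECIAL = 4


def build_vocab(texts):
    """Assign an ID to every distinct character, after the special tokens."""
    chars = sorted({ch for text in texts for ch in text})
    return {ch: i + NUM_SPECIAL for i, ch in enumerate(chars)}


def encode(text, vocab):
    """Wrap per-character IDs in SOS/EOS; unseen characters map to UNK."""
    return [SOS_IDX] + [vocab.get(ch, UNK_IDX) for ch in text] + [EOS_IDX]


def decode(ids, vocab):
    """Invert encode(): skip SOS, stop at EOS, drop IDs with no character."""
    inverse = {i: ch for ch, i in vocab.items()}
    out = []
    for i in ids:
        if i == EOS_IDX:
            break
        if i == SOS_IDX:
            continue
        out.append(inverse.get(i, ""))
    return "".join(out)


tgt_vocab = build_vocab(["നമസ്കാരം", "സുഖം ആണോ", "എന്റെ പേര്"])
ids = encode("നമസ്കാരം", tgt_vocab)
print(ids)
print(decode(ids, tgt_vocab))
```

A Python `str` iterates over Unicode code points, so a conjunct such as സ്ക is three items here: സ, the virama (U+0D4D), and ക. A character-level model therefore has to emit vowel signs and viramas as separate output steps.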

---

## 🚀 Usage (Python)

```python
from model import build_model
from train import load_checkpoint
from dataset import load_vocab, get_inverse_vocab
from config import Config
import torch

# Load vocab
src_vocab = load_vocab("src_vocab.json")
tgt_vocab = load_vocab("tgt_vocab.json")
inv_vocab = get_inverse_vocab(tgt_vocab)

# Build model
model = build_model(len(src_vocab), len(tgt_vocab))
load_checkpoint("pytorch_model.bin", model)
model.eval()

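def transliterate(text):
    """Transliterate a Manglish string into Malayalam script using greedy decoding."""
    # Encode each character as a vocabulary ID, wrapped in SOS/EOS markers.
    ids = [Config.SOS_IDX] + [src_vocab.get(c, Config.UNK_IDX) for c in text] + [Config.EOS_IDX]
    src = torch.tensor([ids])  # shape (1, seq_len): a batch of one sequence

    # Inference only, so skip gradient tracking.
    with torch.no_grad():
        pred_ids = model.greedy_decode(src)

    # Map predicted IDs back to characters, stopping at the first EOS.
    output = ""
    for i in pred_ids:
        if i == Config.EOS_IDX:
            break
        output += inv_vocab.get(i, "")
    return output

print(transliterate("namaskaram"))
```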