| --- |
| license: mit |
| language: |
| - sa |
| - en |
| tags: |
| - sanskrit |
| - paraphrase |
| - diffusion |
| - d3pm |
| - pytorch |
| pipeline_tag: text2text-generation |
| --- |
| |
| # Sanskrit D3PM Paraphrase Model |
|
|
| Takes Sanskrit in Roman (IAST) transliteration as input and generates Devanagari output with a D3PM (discrete denoising diffusion) cross-attention model. |
|
|
| ## Files Included |
|
|
| - `best_model.pt` — trained checkpoint |
| - `config.py` — runtime config |
| - `inference.py` — model loading + generation loop |
| - `inference_api.py` — simple Python API (`predict`) |
| - `handler.py` — Hugging Face Endpoint handler |
| - `model/`, `diffusion/` — architecture modules |
| - `sanskrit_src_tokenizer.json`, `sanskrit_tgt_tokenizer.json` — tokenizers |
|
|
| ## Quick Local Test |
|
|
| ```python |
| from inference_api import predict |
| print(predict("dharmo rakṣati rakṣitaḥ")["output"]) |
| ``` |
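
IAST input can carry its diacritics either as precomposed characters (`ṣ` as one code point) or as a base letter followed by a combining mark. Both render the same but are different code point sequences, so a tokenizer may treat them differently. The sketch below assumes, without confirmation from this repo, that the source tokenizer expects NFC-normalized text. `normalize_iast` is a hypothetical helper, not part of `inference_api.py`.

```python
import unicodedata


def normalize_iast(text):
    """Return `text` in NFC form with runs of whitespace collapsed.

    Hypothetical helper: whether the source tokenizer needs NFC input
    is an assumption, not something this repo documents.
    """
    # NFC merges a base letter and its combining mark into one code point
    # when a precomposed form exists (s + U+0323 becomes U+1E63).
    composed = unicodedata.normalize("NFC", text)
    # split() with no arguments drops leading, trailing, and repeated whitespace.
    return " ".join(composed.split())
```

The result can then be passed on as `predict(normalize_iast(text))`.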
|
|
| ## Transformer-Style Usage (Custom Runtime) |
|
|
| The checkpoint is a plain PyTorch `.pt` file for a custom D3PM architecture, so it cannot be loaded through `transformers` `AutoModel`. |
| Load it and run generation with the bundled runtime instead: |
|
|
| ```python |
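| import torch |
| from config import CONFIG |
| from inference import load_model, run_inference, _decode_clean |
| from model.tokenizer import SanskritSourceTokenizer, SanskritTargetTokenizer |
| |
| # Use the GPU when one is available, otherwise fall back to CPU. |
| device = torch.device("cuda" if torch.cuda.is_available() else "cpu") |
| # load_model returns the model and a config dict. |
| model, cfg = load_model("best_model.pt", CONFIG, device) |
| |
| # Both tokenizers use the model's maximum sequence length. |
| src_tok = SanskritSourceTokenizer(vocab_size=16000, max_len=cfg["model"]["max_seq_len"]) |
| tgt_tok = SanskritTargetTokenizer(vocab_size=16000, max_len=cfg["model"]["max_seq_len"]) |
| |
| text = "dharmo rakṣati rakṣitaḥ" |
| # Encode to a batch of one sequence, shape (1, seq_len). |
| ids = torch.tensor([src_tok.encode(text)], dtype=torch.long, device=device) |
| # Gradients are not needed for generation. |
| with torch.no_grad(): |
|     out = run_inference(model, ids, cfg) |
| # Decode the first (and only) sequence in the batch to Devanagari text. |
| print(_decode_clean(tgt_tok, out[0].tolist())) |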
| ``` |
|
|
| For full `transformers` compatibility (`AutoModel.from_pretrained`), first wrap the architecture in custom `PreTrainedModel` and `PretrainedConfig` subclasses and re-export the weights in that format. |
|
|
| ## Endpoint Payload |
|
|
| ```json |
| { |
| "inputs": "yadā mano nivarteta viṣayebhyaḥ svabhāvataḥ", |
| "parameters": { |
| "temperature": 0.7, |
| "top_k": 40, |
| "repetition_penalty": 1.2, |
| "diversity_penalty": 0.0, |
| "num_steps": 64, |
| "clean_output": true |
| } |
| } |
| ``` |
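
A minimal sketch of building this request body from Python. The parameter names are copied from the payload above. The type checks and the rejection of unknown keys are assumptions made for illustration and are not taken from `handler.py`. `build_payload` is a hypothetical helper.

```python
import json

# Parameter names come from the example payload; the expected types are
# inferred from its example values (an assumption, not read from handler.py).
_PARAM_TYPES = {
    "temperature": float,
    "top_k": int,
    "repetition_penalty": float,
    "diversity_penalty": float,
    "num_steps": int,
    "clean_output": bool,
}


def build_payload(text, **parameters):
    """Return the JSON request body for `text` and optional generation parameters."""
    if not isinstance(text, str) or not text.strip():
        raise ValueError("inputs must be a non-empty string")
    checked = {}
    for name, value in parameters.items():
        expected = _PARAM_TYPES.get(name)
        if expected is None:
            raise KeyError(f"unknown parameter: {name}")
        if expected is bool:
            ok = isinstance(value, bool)
        else:
            # Float fields also accept ints; bool is excluded explicitly
            # because it is a subclass of int in Python.
            numeric = (int, float) if expected is float else (int,)
            ok = isinstance(value, numeric) and not isinstance(value, bool)
        if not ok:
            raise TypeError(
                f"{name} expects {expected.__name__}, got {type(value).__name__}"
            )
        checked[name] = value
    # ensure_ascii=False keeps the IAST diacritics readable in the JSON text.
    return json.dumps({"inputs": text, "parameters": checked}, ensure_ascii=False)
```

For example, `build_payload("dharmo rakṣati rakṣitaḥ", temperature=0.7, num_steps=64)` returns a string that can be sent as the POST body to the endpoint URL with a `Content-Type: application/json` header.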
|
|
| ## Push This Folder To Model Hub |
|
|
| ```bash |
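| huggingface-cli login |
| huggingface-cli repo create <your-username>/sanskrit-d3pm --type model |
| cd hf_model_repo |
| git init |
| git lfs install |
| git lfs track "*.pt"  # store the checkpoint through Git LFS |
| git remote add origin https://huggingface.co/<your-username>/sanskrit-d3pm |
| git add . |
| git commit -m "Initial model release" |
| git branch -M main  # make sure the local branch is named main |
| # If the push is rejected because the Hub repo already has a commit, |
| # clone the Hub repo and copy these files into the clone instead. |
| git push -u origin main |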
| ``` |
|
|