Pothana Base 300M
A 387M-parameter LLaMA-style language model trained from scratch on Telugu text.
Named after Bammera Pothana, the celebrated 15th-century Telugu poet who authored the Andhra Maha Bhagavatamu.
Developed by Dvitva AI.
Model Details
| Property | Value |
|---|---|
| Model | pothana-base-300M |
| Architecture | LLaMA (RoPE + SwiGLU + RMSNorm + GQA) |
| Parameters | 387M (unique) |
| Hidden size | 1024 |
| Layers | 30 unique (60 effective via weight sharing) |
| Attention heads | 16 Q / 4 KV (Grouped Query Attention) |
| Intermediate size | 2816 |
| Context length | 2048 |
| Vocab size | 48,000 |
| Tokenizer | SentencePiece Unigram (48K) |
| Training | Single GPU, bf16 mixed precision |
| Developed by | Dvitva AI |
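The 387M figure can be roughly cross-checked from the table. The sketch below assumes a standard LLaMA block layout (separate Q/K/V/O projections, three SwiGLU matrices, two RMSNorm weights per block) and assumes the input embedding and the output head share one matrix. Neither detail is stated on this card; the tied-embedding assumption is used because it reproduces the reported count.

```python
# Rough parameter count from the Model Details table.
# Assumptions (not stated on the card): standard LLaMA block layout,
# head_dim = hidden / n_q_heads, tied input/output embeddings.
hidden, n_blocks = 1024, 30          # unique blocks only; reused blocks add no weights
n_q_heads, n_kv_heads = 16, 4
intermediate, vocab = 2816, 48_000

head_dim = hidden // n_q_heads       # 64
kv_dim = n_kv_heads * head_dim       # 256

attn = 2 * hidden * hidden + 2 * hidden * kv_dim   # q_proj, o_proj + k_proj, v_proj
mlp = 3 * hidden * intermediate                    # gate, up, down projections
norms = 2 * hidden                                 # two RMSNorm weights per block
per_block = attn + mlp + norms

embedding = vocab * hidden                         # counted once (tied, assumed)
total = n_blocks * per_block + embedding + hidden  # + final RMSNorm weight

print(f"{total:,} parameters (~{total / 1e6:.0f}M)")  # 387,380,224 parameters (~387M)
```

Under these assumptions the count depends only on the 30 unique blocks, which is why the table reports 387M even though 60 layers are executed.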
Quick Start
Using pipeline
from transformers import pipeline
pipe = pipeline("text-generation", model="dvitvaai/pothana-base-300M", trust_remote_code=True)
result = pipe("తెలుగు భాష", max_new_tokens=50, do_sample=True, temperature=0.8)
print(result[0]["generated_text"])
Note: trust_remote_code=True is required for the custom tokenizer, which cleans up SentencePiece word-boundary markers so that the output is readable.
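For context, SentencePiece marks the start of each word with the ▁ character (U+2581) instead of a space. The sketch below shows the kind of cleanup the note refers to, applied to a hand-written list of pieces. The function name and the sample pieces are hypothetical, and this is not the repository's actual tokenizer code.

```python
# Illustration only: how SentencePiece-style pieces map back to readable text.
# `pieces_to_text` and the sample pieces are hypothetical.
WORD_BOUNDARY = "\u2581"  # "▁", SentencePiece's word-boundary marker

def pieces_to_text(pieces):
    """Join pieces and turn boundary markers back into spaces."""
    return "".join(pieces).replace(WORD_BOUNDARY, " ").strip()

pieces = ["\u2581తెలుగు", "\u2581భాష"]
print(pieces_to_text(pieces))  # తెలుగు భాష
```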
Manual loading
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model = AutoModelForCausalLM.from_pretrained("dvitvaai/pothana-base-300M")
tokenizer = AutoTokenizer.from_pretrained("dvitvaai/pothana-base-300M", trust_remote_code=True)
text = "తెలుగు భాష చాలా అందమైనది"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=100,
        temperature=0.8,
        top_k=50,
        do_sample=True,
    )
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Tokenizer
This model uses a SentencePiece Unigram tokenizer with a 48K vocabulary, trained directly on Telugu text.
- Handles raw Telugu text directly (no preprocessing needed)
- Byte-fallback for out-of-vocabulary characters
- Split digits for better number handling
- NFKC normalization
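To make the last two items concrete, the sketch below reproduces their effect with the Python standard library. It only illustrates what NFKC normalization and digit splitting do to text; in the real tokenizer both steps happen inside SentencePiece, and the helper names here are hypothetical.

```python
import re
import unicodedata

def normalize_nfkc(text: str) -> str:
    """Fold compatibility characters (e.g. full-width digits) to canonical forms."""
    return unicodedata.normalize("NFKC", text)

def split_digits(text: str) -> list[str]:
    """Emit every digit as its own unit; keep non-digit runs together."""
    return re.findall(r"\d|\D+", text)

print(normalize_nfkc("２０２５"))   # full-width digits become "2025"
print(split_digits("2025 లో"))     # ['2', '0', '2', '5', ' లో']
```

Splitting digits means a number is always built from the same ten single-digit units, rather than from whatever multi-digit chunks happened to be frequent in the training corpus.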
Architecture
Key features:
- Grouped Query Attention (GQA): 16 query heads share 4 KV heads, giving a 4x smaller KV cache than standard multi-head attention
- Block-wise Weight Sharing: 30 unique blocks, each used twice = 60 effective layers (MobileLLM-LS)
- SwiGLU MLP with 2816 intermediate size
- RoPE positional encoding (theta=10000.0)
- RMSNorm (no bias in any linear layer)
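The first two items can be checked with a little arithmetic. The sketch below assumes that each block is repeated immediately (0, 0, 1, 1, ...), that every one of the 60 executions keeps its own K/V cache, and that the cache is stored in bf16 at the full 2048-token context. These are illustrative assumptions and are not confirmed by this card.

```python
# Layer schedule and KV-cache size under the assumptions stated above.
n_unique_blocks, repeats = 30, 2
n_q_heads, n_kv_heads = 16, 4
head_dim = 1024 // n_q_heads          # 64
context, bytes_per_value = 2048, 2    # bf16 = 2 bytes (assumed cache dtype)

# Index of the weight block executed at each effective layer: 0, 0, 1, 1, ..., 29, 29
schedule = [b for b in range(n_unique_blocks) for _ in range(repeats)]
effective_layers = len(schedule)      # 60

def kv_cache_bytes(kv_heads):
    # K and V tensors for every effective layer and every token in the context
    return 2 * effective_layers * kv_heads * head_dim * context * bytes_per_value

gqa = kv_cache_bytes(n_kv_heads)      # 4 KV heads, as in this model
mha = kv_cache_bytes(n_q_heads)       # hypothetical 16 KV heads for comparison
print(f"GQA: {gqa / 2**20:.0f} MiB, MHA: {mha / 2**20:.0f} MiB, ratio {mha // gqa}x")
# GQA: 120 MiB, MHA: 480 MiB, ratio 4x
```

Weight sharing reduces stored parameters but not compute or cache size: the schedule still runs 60 layers per token, which is why the cache estimate uses the effective depth.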
Training
- Data: Telugu text corpus (Sangraha dataset)
- Preprocessing: SentencePiece tokenization (raw text)
- Optimizer: AdamW (lr=3e-4, weight_decay=0.1, beta1=0.9, beta2=0.95)
- Schedule: WSD (Warmup-Stable-Decay)
- Precision: bf16 mixed precision
- Hardware: Single NVIDIA B200 GPU
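A WSD schedule ramps the learning rate up, holds it at the peak for most of training, then decays it at the end. The sketch below uses the peak of 3e-4 listed above; the warmup fraction, decay fraction, linear decay shape, and final learning rate are hypothetical placeholders, since the card does not give them.

```python
def wsd_lr(step, total_steps, peak_lr=3e-4,
           warmup_frac=0.01, decay_frac=0.1, min_lr=0.0):
    """Warmup-Stable-Decay learning rate at a given step.

    warmup_frac, decay_frac, min_lr and the linear decay are assumed values,
    not the settings used to train this model.
    """
    warmup_steps = int(total_steps * warmup_frac)
    decay_steps = int(total_steps * decay_frac)
    decay_start = total_steps - decay_steps

    if step < warmup_steps:                      # linear warmup
        return peak_lr * (step + 1) / warmup_steps
    if step < decay_start:                       # stable phase at peak
        return peak_lr
    progress = (step - decay_start) / max(1, decay_steps)
    return peak_lr + (min_lr - peak_lr) * min(1.0, progress)  # linear decay

for s in (0, 9, 500, 950, 1000):
    print(s, wsd_lr(s, total_steps=1000))
```

Because the stable phase does not depend on the total step count, a run can be extended from a stable-phase checkpoint and decayed later, which is a common reason for choosing WSD over cosine decay.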
Limitations
- This is a base model (not instruction-tuned) — it performs text completion, not instruction following
- Trained primarily on Telugu text; limited multilingual capability
- Small model size (387M) limits reasoning and knowledge capacity
License
Apache 2.0
Citation
If you use this model, please cite:
@misc{pothana-base-300M,
title={Pothana Base 300M: A Telugu Language Model},
author={Dvitva AI},
year={2025},
url={https://huggingface.co/dvitvaai/pothana-base-300M}
}