Pothana Base 300M

A 387M-parameter LLaMA-style language model trained from scratch on Telugu text.

Named after Bammera Pothana, the celebrated 15th-century Telugu poet who authored the Andhra Maha Bhagavatamu.

Developed by Dvitva AI.

Model Details

Model: pothana-base-300M
Architecture: LLaMA (RoPE + SwiGLU + RMSNorm + GQA)
Parameters: 387M (unique)
Hidden size: 1024
Layers: 30 unique (60 effective via weight sharing)
Attention heads: 16 Q / 4 KV (Grouped Query Attention)
Intermediate size: 2816
Context length: 2048
Vocab size: 48,000
Tokenizer: SentencePiece Unigram (48K)
Training: Single GPU, bf16 mixed precision
Developed by: Dvitva AI

Quick Start

Using pipeline

from transformers import pipeline

pipe = pipeline("text-generation", model="dvitvaai/pothana-base-300M", trust_remote_code=True)
result = pipe("తెలుగు భాష", max_new_tokens=50, do_sample=True, temperature=0.8)
print(result[0]["generated_text"])

Note: trust_remote_code=True is required because the model ships a custom tokenizer that cleans up SentencePiece word-boundary markers, producing readable output.

Manual loading

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained("dvitvaai/pothana-base-300M")
tokenizer = AutoTokenizer.from_pretrained("dvitvaai/pothana-base-300M", trust_remote_code=True)

text = "తెలుగు భాష చాలా అందమైనది"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=100,
        temperature=0.8,
        top_k=50,
        do_sample=True,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Tokenizer

This model uses a SentencePiece Unigram tokenizer with a 48K vocabulary, trained directly on Telugu text.

  • Handles raw Telugu text directly (no preprocessing needed)
  • Byte-fallback for out-of-vocabulary characters
  • Split digits for better number handling
  • NFKC normalization
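The last two options can be illustrated with plain Python. This is only a sketch of what the settings mean, not the tokenizer's actual code; `preprocess` and `split_digits` are hypothetical helper names:

```python
import re
import unicodedata

def preprocess(text: str) -> str:
    # NFKC normalization folds compatibility forms, e.g. full-width
    # digits like "１２３" become ASCII "123" before tokenization.
    return unicodedata.normalize("NFKC", text)

def split_digits(text: str) -> list[str]:
    # "Split digits" means every digit becomes its own piece, so
    # numbers like 2048 don't require multi-digit vocabulary entries.
    return [piece for piece in re.split(r"(\d)", text) if piece]
```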

Architecture

Key features:

  • Grouped Query Attention (GQA): 16 query heads, 4 KV heads — 4x KV cache reduction
  • Block-wise Weight Sharing: 30 unique blocks, each used twice = 60 effective layers (MobileLLM-LS)
  • SwiGLU MLP with 2816 intermediate size
  • RoPE positional encoding (theta=10000.0)
  • RMSNorm (no bias in any linear layer)
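The numbers above can be sanity-checked with back-of-the-envelope arithmetic. A sketch, assuming tied input/output embeddings and counting each shared block once:

```python
# Config values from the model card.
hidden, layers, inter, vocab = 1024, 30, 2816, 48_000
q_heads, kv_heads = 16, 4
head_dim = hidden // q_heads  # 64

# Attention: with GQA, k/v projections are 4x narrower than q/o.
attn = hidden * hidden                          # q_proj
attn += 2 * hidden * (kv_heads * head_dim)      # k_proj + v_proj
attn += hidden * hidden                         # o_proj

mlp = 3 * hidden * inter    # SwiGLU: gate, up, down (no bias)
norms = 2 * hidden          # two RMSNorms per block

per_layer = attn + mlp + norms
total = layers * per_layer + vocab * hidden + hidden  # + final norm
print(f"{total / 1e6:.0f}M unique parameters")        # ≈ 387M

# GQA KV-cache saving relative to full multi-head attention:
kv_cache_reduction = q_heads // kv_heads  # 4x
```

The 60 effective layers from weight sharing add compute, not parameters, which is why the unique count stays at 387M.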

Training

  • Data: Telugu text corpus (Sangraha dataset)
  • Preprocessing: SentencePiece tokenization (raw text)
  • Optimizer: AdamW (lr=3e-4, weight_decay=0.1, beta1=0.9, beta2=0.95)
  • Schedule: WSD (Warmup-Stable-Decay)
  • Precision: bf16 mixed precision
  • Hardware: Single NVIDIA B200 GPU
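A WSD schedule holds the peak learning rate flat between a linear warmup and a final decay. A minimal sketch; the warmup and decay fractions here are illustrative, as the card does not state them:

```python
def wsd_lr(step: int, max_steps: int, base_lr: float = 3e-4,
           warmup_frac: float = 0.01, decay_frac: float = 0.1) -> float:
    """Warmup-Stable-Decay: linear warmup, flat plateau, linear decay to 0."""
    warmup = int(max_steps * warmup_frac)
    decay_start = int(max_steps * (1 - decay_frac))
    if step < warmup:                       # warmup phase
        return base_lr * step / max(warmup, 1)
    if step < decay_start:                  # stable phase
        return base_lr
    return base_lr * (max_steps - step) / max(max_steps - decay_start, 1)
```

Unlike cosine decay, the stable phase lets training be extended or checkpointed mid-run without committing to a total step count up front.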

Limitations

  • This is a base model (not instruction-tuned) — it performs text completion, not instruction following
  • Trained primarily on Telugu text; limited multilingual capability
  • Small model size (387M) limits reasoning and knowledge capacity

License

Apache 2.0

Citation

If you use this model, please cite:

@misc{pothana-base-300M,
  title={Pothana Base 300M: A Telugu Language Model},
  author={Dvitva AI},
  year={2025},
  url={https://huggingface.co/dvitvaai/pothana-base-300M}
}