# Pothana Base 300M
A 345M parameter LLaMA-style language model trained from scratch on Telugu text.
Named after Bammera Pothana, the celebrated 15th-century Telugu poet who authored the Andhra Maha Bhagavatamu.
Developed by Dvitva AI.
## Model Details
| Property | Value |
|---|---|
| Model | pothana-base-300M |
| Architecture | LLaMA (RoPE + SwiGLU + RMSNorm) |
| Parameters | 345M |
| Hidden size | 1024 |
| Layers | 20 |
| Attention heads | 16 |
| Intermediate size | 2816 |
| Context length | 2048 |
| Vocab size | 86,071 |
| Tokenizer | Morfessor + BPE (Telugu morpheme-aware) |
| Training | Single GPU, bf16 mixed precision |
| Developed by | Dvitva AI |
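The 345M parameter count can be reproduced from the table above. A back-of-the-envelope sketch, assuming tied input/output embeddings (not stated in this card) and standard LLaMA layer shapes:

```python
# Rough parameter count from the configuration in the table above.
# Assumes tied input/output embeddings -- an assumption, not confirmed by the card.
vocab, d, layers, ffn = 86_071, 1024, 20, 2816

embedding = vocab * d                       # token embeddings (shared with lm_head if tied)
attention = 4 * d * d                       # Q, K, V, O projections per layer
mlp = 3 * d * ffn                           # SwiGLU: gate, up, and down projections
norms = 2 * d                               # two RMSNorm weight vectors per layer
per_layer = attention + mlp + norms

total = embedding + layers * per_layer + d  # + final RMSNorm
print(f"{total / 1e6:.0f}M")                # ≈ 345M
```

Untied embeddings would add another ~88M parameters, so the tied-embedding assumption is what makes the total land at 345M.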
## Quick Start

### Using pipeline
```python
from transformers import pipeline

pipe = pipeline("text-generation", model="dvitvaai/pothana-base-300M", trust_remote_code=True)
result = pipe("తెలుగు భాష", max_new_tokens=50, do_sample=True, temperature=0.8)
print(result[0]["generated_text"])
```
**Note:** `trust_remote_code=True` is required for the custom tokenizer that handles `@@` morpheme joining. Without it, `@@` markers will appear in the output.
### Manual loading
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained("dvitvaai/pothana-base-300M")
tokenizer = AutoTokenizer.from_pretrained("dvitvaai/pothana-base-300M", trust_remote_code=True)

# Input must be Morfessor-segmented (with @@ continuation markers)
segmented_text = "తెలుగు భాష చాలా అందమైన@@ ది"
inputs = tokenizer(segmented_text, return_tensors="pt")

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=100,
        temperature=0.8,
        top_k=50,
        do_sample=True,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Tokenizer
This model uses a Morfessor + BPE hybrid tokenizer designed for Telugu:

- **Telugu text:** segmented into morphemes using Morfessor, with `@@` continuation markers
- **Non-Telugu text** (English, numbers, URLs): handled by BPE subword encoding
- **Fallback:** character-level encoding for out-of-vocabulary tokens

**Important:** the tokenizer expects pre-segmented input (with `@@` markers). For raw Telugu text, you need to run Morfessor segmentation first.
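The `@@` convention itself is simple to illustrate with plain string operations. A minimal sketch of how the continuation markers rejoin into surface text (an illustration of the convention, not the tokenizer's actual implementation):

```python
def join_morphemes(text: str) -> str:
    # "@@ " marks a morpheme that continues into the next token;
    # deleting the marker and the following space rejoins the word.
    return text.replace("@@ ", "")

print(join_morphemes("తెలుగు భాష చాలా అందమైన@@ ది"))
# → తెలుగు భాష చాలా అందమైనది
```

This is what the custom tokenizer loaded via `trust_remote_code=True` does for you at decode time.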
### Full pipeline (raw Telugu text)
For raw Telugu text, segment with Morfessor first:
```python
import re
import morfessor

# Load the Morfessor segmentation model
io = morfessor.MorfessorIO()
morf_model = io.read_binary_model_file("morfessor_telugu.bin")

TELUGU_RE = re.compile(r"[\u0C00-\u0C7F]+")

def segment_telugu(text, separator="@@"):
    tokens = []
    for word in text.split():
        if TELUGU_RE.fullmatch(word):
            # viterbi_segment returns (segments, score); keep the segments
            segments = morf_model.viterbi_segment(word)[0]
            for i, seg in enumerate(segments):
                tokens.append(seg + separator if i < len(segments) - 1 else seg)
        else:
            tokens.append(word)
    return " ".join(tokens)

# Segment, then tokenize and generate
raw_text = "తెలుగు భాష చాలా అందమైనది"
segmented = segment_telugu(raw_text)
inputs = tokenizer(segmented, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Training
- Data: Telugu text corpus (Sangraha dataset)
- Preprocessing: Morfessor morpheme segmentation + BPE for non-Telugu
- Optimizer: AdamW (lr=3e-4, weight_decay=0.1, beta1=0.9, beta2=0.95)
- Schedule: Cosine LR decay with 500-step warmup
- Precision: bf16 mixed precision
- Hardware: Single GPU
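The schedule above can be sketched as a step-to-learning-rate function. A minimal sketch, assuming linear warmup; the total step count and floor learning rate are illustrative assumptions, not values from this card:

```python
import math

def lr_at(step, max_lr=3e-4, warmup=500, total=100_000, min_lr=3e-5):
    """Linear warmup for `warmup` steps, then cosine decay to `min_lr`.

    `total` and `min_lr` are assumed values for illustration only.
    """
    if step < warmup:
        return max_lr * step / warmup          # linear ramp from 0 to max_lr
    t = (step - warmup) / (total - warmup)     # decay progress in [0, 1]
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * t))
```

The peak of 3e-4 is reached exactly at step 500, after which the rate decays smoothly toward the floor.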
## Limitations
- This is a base model (not instruction-tuned) — it performs text completion, not instruction following
- The tokenizer requires Morfessor-segmented input for best results
- Trained primarily on Telugu text; limited multilingual capability
- Small model size (345M) limits reasoning and knowledge capacity
## License
Apache 2.0
## Citation
If you use this model, please cite:
```bibtex
@misc{pothana-base-300M,
  title={Pothana Base 300M: A Telugu Language Model},
  author={Dvitva AI},
  year={2025},
  url={https://huggingface.co/dvitvaai/pothana-base-300M}
}
```