Pothana Stage A+ — 230M Telugu LM with Roman Telugu (Tenglish) Capability

Stage A+ extends dvitvaai/pothana-base-v2-225M with code-mix and Roman Telugu (Tenglish) capabilities. The model can now read and write Telugu in three styles:

Pure Telugu script — same as Base v2
Code-mixed (Telugu + English script) — e.g., "నేను meeting కి వెళ్తున్నాను"
Roman Telugu (Tenglish) — e.g., "naku rendu cinemalu chudaalani undi"
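
As a rough illustration of the three styles above, the sketch below labels an input by the scripts it contains. The helper name `input_style` is hypothetical and is not part of the model or tokenizer. It relies only on the Telugu Unicode block (U+0C00 to U+0C7F) and ASCII letters, so it cannot tell Roman Telugu apart from plain English.

```python
import re

# Telugu Unicode block is U+0C00..U+0C7F; ASCII letters stand in for Roman script.
TELUGU = re.compile(r"[\u0C00-\u0C7F]")
LATIN = re.compile(r"[A-Za-z]")


def input_style(text: str) -> str:
    """Label text as 'telugu', 'code-mixed', 'roman', or 'other' (hypothetical helper)."""
    has_telugu = bool(TELUGU.search(text))
    has_latin = bool(LATIN.search(text))
    if has_telugu and has_latin:
        return "code-mixed"
    if has_telugu:
        return "telugu"
    if has_latin:
        return "roman"
    return "other"


for sample in (
    "నేను రేపు ఆఫీసుకు వెళ్లాలి",
    "నేను meeting కి వెళ్తున్నాను",
    "naku rendu cinemalu chudaalani undi",
):
    print(input_style(sample), "|", sample)
```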

Designed for mobile deployment where Indian users mix scripts freely.

Status: pretrained base model with code-mix capability. Not yet instruction-tuned. Intended as a starting point for retrieval-augmented or instruction fine-tuning.

Quick start

pip install "transformers>=4.40,<5.0" morfessor

⚠️ transformers 5.x is not supported yet. The tokenizers 0.22+ dependency pulled in by transformers 5.x has a WordLevel encoding regression that splits our morfessor-segmented input into individual characters. Pin to a 4.x release; tested on 4.55–4.57.

from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="dvitvaai/pothana-stage-a-plus-225M",
    trust_remote_code=True,
)

# Mixed-script input — pipeline handles it directly.
print(pipe("నేను రేపు office ki వెళ్లాలి"))
print(pipe("naku rendu cinemalu chudaalani undi"))

Or with the lower-level API:

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "dvitvaai/pothana-stage-a-plus-225M", trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "dvitvaai/pothana-stage-a-plus-225M", trust_remote_code=True,
)

GEN = dict(
    max_new_tokens=80,
    do_sample=True, temperature=0.7, top_p=0.9, repetition_penalty=1.15,
)

# Telugu input — tokenizer runs morfessor v4 segmentation internally.
inputs = tokenizer("నేను రేపు ఆఫీసుకు వెళ్లాలి", return_tensors="pt")
out = model.generate(**inputs, **GEN)
print(tokenizer.decode(out[0], skip_special_tokens=True))

# Roman Telugu input — passes through without segmentation.
inputs = tokenizer("naku rendu cinemalu chudaalani undi", return_tensors="pt")
out = model.generate(**inputs, **GEN)
print(tokenizer.decode(out[0], skip_special_tokens=True))

trust_remote_code=True is required for the custom PothanaForCausalLM (Llama + QK-norm) and for the PothanaTokenizer, which runs morfessor v4 segmentation on Telugu input and strips the @@ continuation prefix at decode time.
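
To make the @@ continuation convention concrete, here is a minimal sketch of how segmented tokens could be rejoined into words at decode time. It is an assumption about the general idea, not the PothanaTokenizer implementation: the function `join_segments` and the example segmentation are hypothetical.

```python
from typing import List


def join_segments(tokens: List[str], marker: str = "@@") -> str:
    """Rejoin tokens into words; a token starting with `marker` continues the previous word."""
    words: List[str] = []
    for tok in tokens:
        if tok.startswith(marker):
            piece = tok[len(marker):]
            if words:
                words[-1] += piece   # glue the continuation onto the previous word
            else:
                words.append(piece)  # stray marker at the start: keep the text
        else:
            words.append(tok)
    return " ".join(words)


# Hypothetical segmentation of "ఆఫీసుకు వెళ్లాలి" into stem + suffix.
print(join_segments(["ఆఫీసు", "@@కు", "వెళ్లాలి"]))
```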

The morfessor package is required so the tokenizer can segment raw Telugu text the way training did. The morfessor model and supporting files ship with the repo and load automatically. A generation_config.json is also shipped with sane sampling defaults — the model loops badly under greedy decoding (see Limitations).

What's new vs Base v2

Property	Base v2	Stage A+
Vocab size	47,831	52,831 (+5,000 Roman Telugu word tokens)
Parameters	222M	230M (+8M from new embedding rows)
Telugu capability	✓	✓
Code-mix (Te+En script)	weak	strong
Roman Telugu reading	weak	strong
Roman Telugu writing	weak	moderate
Retrieval-format `<retrieved>` recognition	✓	✓

Training pipeline summary

Base v2 (val=3.16, 49h)
   ↓ Stage A: retrieval-aware continued pretrain (val=3.05, 6h)
   ↓ Resize vocab 47,831 → 52,831 (add top-5K Roman word tokens, smart-init)
   ↓ Stage A+: code-mix continued pretrain (200 steps, ~3.4 epochs on 31M-token codemix corpus)
   ↓ [THIS MODEL]

Stage A+ specifics

Data: 12,258 Telugu chunks rewritten by Gemini 2.0 Flash into three formats:
- codemix_te_en: natural code-mixing (Te-script + En-script)
- codemix_roman: same code-mixing, all-Roman (phone-typed Tenglish)
- telugu_roman: pure Telugu in Roman script
- Plus 10% original Telugu (anti-forgetting buffer)
Total tokens: 31.2M (a small, focused corpus for continued pretraining)
Tokenizer extension: top-5K most-frequent Roman Telugu words promoted from BPE-fallback to direct tokens (~33% compression on Roman content)
Training: B200, effective batch 128 × seq 4096, LR 2e-5, WSD schedule, ~20 min wall time
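
The step count, batch shape, and corpus size quoted above can be cross-checked with a few lines of arithmetic. This sketch assumes "effective batch 128 × seq 4096" means 128 fully packed sequences of 4,096 tokens per optimizer step; padding or shorter sequences would lower the result.

```python
# Figures taken from the training summary above.
steps = 200
batch_sequences = 128
seq_len = 4096
corpus_tokens = 31.2e6

# Assumption: every sequence is packed to the full 4,096 tokens.
tokens_per_step = batch_sequences * seq_len
tokens_seen = steps * tokens_per_step
epochs = tokens_seen / corpus_tokens

print(f"{tokens_per_step:,} tokens per step")
print(f"{tokens_seen:,} tokens seen in {steps} steps")
print(f"{epochs:.2f} passes over the corpus")
```

Under that assumption the result lands close to the ~3.4 epochs reported in the pipeline summary.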

Tokenizer

morfessor_bpe_telugu_v4-v6:

47,831 v4 tokens (morfessor Telugu morphemes + BPE merges for non-Telugu)
+5,000 Roman Telugu word tokens (e.g., nunchi, prabhutvam, kosam, mukhyamantri)
9 retrieval special tokens (IDs 47822–47830, already counted within the 47,831 v4 tokens; unused for now)
Total: 52,831 tokens

The top-5K Roman tokens compress common Roman Telugu words substantially: a word like prabhutvam, previously split into 5 BPE subwords, is now a single token.

Tokenizer fertility

Telugu (segmented): same as v4
English: 1.81 tokens/word (unchanged)
Roman Telugu (Tenglish): ~2.5 tokens/word on common forms (vs ~5 with pure BPE before)
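
Fertility here means average tokens per whitespace-separated word. The sketch below shows one way to measure it. The `fertility` helper and the toy tokenizer are illustrative assumptions and do not reproduce how the figures above were computed.

```python
from typing import Callable, Iterable, List


def fertility(texts: Iterable[str], tokenize: Callable[[str], List[str]]) -> float:
    """Average number of tokens per whitespace-separated word."""
    total_tokens = 0
    total_words = 0
    for text in texts:
        total_tokens += len(tokenize(text))
        total_words += len(text.split())
    if total_words == 0:
        raise ValueError("no words to measure")
    return total_tokens / total_words


def toy_tokenize(text: str) -> List[str]:
    """Toy stand-in tokenizer: cuts each word into chunks of at most 4 characters."""
    return [w[i:i + 4] for w in text.split() for i in range(0, len(w), 4)]


print(round(fertility(["naku meeting undi"], toy_tokenize), 2))
```

With the real tokenizer loaded as in the quick start, passing `tokenizer.tokenize` as the `tokenize` argument should work, assuming the custom tokenizer exposes the standard Transformers `tokenize` method.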

Architecture


Parameters	230M unique (378M on disk due to weight-sharing unroll)
Hidden size	768
Layers (unique)	24
Layers (effective with weight sharing)	48
Attention	GQA 16Q / 4KV, head_dim 48
MLP	SwiGLU, intermediate 2048
Norm	RMSNorm (eps=1e-6)
Position	RoPE, θ=500,000
QK-norm	yes
Tied embeddings	no
Vocab	52,831
Max context	4,096
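
The parameter figures in this table can be approximately reproduced from the listed dimensions. The sketch below is a back-of-the-envelope count, not the model's actual module layout. It assumes no bias terms, two RMSNorm weights per layer, QK-norm weights of size head_dim for Q and K, one final norm, and separate input and output embedding matrices (untied).

```python
def count_params(vocab: int, n_layers: int, hidden: int = 768, n_q: int = 16,
                 n_kv: int = 4, head_dim: int = 48, mlp: int = 2048) -> int:
    """Rough parameter count for a Llama-style decoder under the stated assumptions."""
    attn = hidden * n_q * head_dim          # Q projection
    attn += 2 * hidden * n_kv * head_dim    # K and V projections (GQA)
    attn += n_q * head_dim * hidden         # output projection
    swiglu = 3 * hidden * mlp               # gate, up, and down projections
    norms = 2 * hidden + 2 * head_dim       # two RMSNorms + QK-norm (assumed shapes)
    per_layer = attn + swiglu + norms
    embeddings = 2 * vocab * hidden         # untied input and output embeddings
    return n_layers * per_layer + embeddings + hidden  # + final RMSNorm


unique = count_params(vocab=52_831, n_layers=24)
unrolled = count_params(vocab=52_831, n_layers=48)
base_v2 = count_params(vocab=47_831, n_layers=24)

print(f"unique:   {unique / 1e6:.1f}M")
print(f"unrolled: {unrolled / 1e6:.1f}M")
print(f"base v2:  {base_v2 / 1e6:.1f}M")
print(f"added by 5,000 new embedding rows: {(unique - base_v2) / 1e6:.2f}M")
```

Under these assumptions the counts come out near the 230M unique, 378M unrolled, and 222M Base v2 figures quoted in this card, and the vocabulary extension accounts for roughly 8M parameters.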

What this model is good at

Reading code-mixed Telugu — handles "నేను meeting కి వెళ్తున్నాను" naturally
Reading Roman Telugu — handles "naku meeting undi" via direct tokens for common words
Generating coherent Telugu prose — short-to-medium length news/literature-style output
Generating natural code-mixed Telugu — mixes English nouns into Telugu sentences

Limitations

Loops at low temperature: like most base models of this size, it gets stuck in repetition under greedy or low-temperature sampling. Use temperature 0.7 or higher with repetition_penalty=1.15 for cleaner output (both ship as defaults in generation_config.json).
Roman Telugu input is partially <unk>-prone. Only the top-5K most-frequent Roman words are direct vocab entries; the HF WordLevel tokenizer used here has no BPE fallback, so less-common Roman forms (e.g. naku, cinemalu, chudaalani) encode as <unk> and lose their content. Telugu-script and code-mixed Te+En script inputs work cleanly. A future tokenizer rebuild with BPE fallback will fix this.
Roman Telugu generation is weaker than reading: the model produces fragmented Roman output even though it reads in-vocabulary Roman input well. This should improve with Stage B SFT (planned).
Retrieval grounding is NOT yet trained — model accepts <retrieved>...</retrieved> format from Stage A, but doesn't yet condition answers on retrieved content. This is intentional: grounded retrieval is taught at Stage B (SFT on synthetic traces).
No instruction tuning — base model only. Zero-shot prompts get continuation-style outputs, not Q&A behavior.
Factual coverage limited to Sangraha corpus (general Telugu web/news) + 8.8% English Wikipedia from Base v2.
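
Because out-of-vocabulary Roman words map to <unk>, it can help to measure how much of an input is lost before generating. The sketch below is one hedged way to do that. `unk_rate` is a hypothetical helper, and `StubTokenizer` is a stand-in with a made-up two-word vocabulary so the example runs without downloading the model.

```python
def unk_rate(text: str, tokenizer) -> float:
    """Fraction of token IDs equal to the tokenizer's unknown-token ID."""
    ids = tokenizer(text, add_special_tokens=False)["input_ids"]
    if not ids:
        return 0.0
    return sum(1 for i in ids if i == tokenizer.unk_token_id) / len(ids)


class StubTokenizer:
    """Hypothetical stand-in: known words get an ID, everything else becomes <unk>."""
    unk_token_id = 0
    vocab = {"meeting": 1, "undi": 2}  # made-up vocabulary, not the real one

    def __call__(self, text, add_special_tokens=False):
        return {"input_ids": [self.vocab.get(w, self.unk_token_id) for w in text.split()]}


rate = unk_rate("naku meeting undi", StubTokenizer())
print(f"{rate:.0%} of tokens are <unk>")
```

The same function should accept the tokenizer loaded in the quick start, since it only relies on the standard call interface and the `unk_token_id` attribute. A high rate suggests rewriting the input in Telugu script or code-mixed form.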

Intended use

Starting point for downstream work:

Retrieval-augmented fine-tuning — the natural next step (Stage B)
Telugu / Tenglish instruction tuning — possible with appropriate dataset
Telugu text classification, NER, summarization — fine-tune with task data
Research on small-scale Telugu language modeling

Evaluation

Stage A val_loss: 3.05 (on retrieval-mixed corpus)
Stage A+ best_val_loss: ~3.0 (codemix corpus, 3.4 epochs)

External benchmarks (IndicGLUE, TyDi-QA-Telugu) have not been run yet.

Citation

@misc{pothana-stage-a-plus-225M,
  title  = {Pothana Stage A+: A 230M Telugu LM with Roman Telugu and code-mix capability},
  author = {Katrapati, Ganesh},
  year   = {2026},
  howpublished = {\url{https://huggingface.co/dvitvaai/pothana-stage-a-plus-225M}},
}

Acknowledgments

Base model: dvitvaai/pothana-base-v2-225M
Codemix synthetic data: Gemini 2.0 Flash
Telugu corpus: AI4Bharat Sangraha

License

Apache 2.0.
