PALL-Text — A Dental-Domain Llama-3.1-8B

PALL-Text is a dental-domain–specialized large language model, adapted from Llama-3.1-8B through a three-stage post-training pipeline — Continued Pre-Training (CPT) → Supervised Fine-Tuning (SFT) → Direct Preference Optimization (DPO) — run end-to-end under 4-bit QLoRA on a single A100-40GB GPU for roughly $20 of cloud compute.

This repository hosts the final, fully-merged bf16 model (all three adapters merged into the base weights — no PEFT adapter required at inference).

Developed by: Harisundar R
Base model: unsloth/Meta-Llama-3.1-8B
Code: PALL on GitHub
Companion VLM: Harisundar/PALL-VLM
Training data: Harisundar/pall
Language: English
License: Llama 3.1 Community License

Model description

Frontier LLMs remain clinically uneven on dental tasks, while specialized dental models are typically closed-weight and require multi-GPU clusters. PALL closes this gap with an open, single-GPU, reproducible recipe. The contribution is integration, not a new algorithm: established parameter-efficient techniques combined into one affordable pipeline and applied to dentistry — including preference tuning for clinical safety, which the dental-LLM literature otherwise lacks.

Architecture

LlamaForCausalLM, 8.03B parameters, bf16, 32 layers, hidden size 4096, vocab 128,256.
Grouped-Query Attention (32 query / 8 KV heads), RoPE, RMSNorm, SwiGLU MLP.
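
As a sanity check on the 8.03B figure, the parameter count can be rebuilt from the dimensions listed above. This is a back-of-the-envelope sketch: the MLP width (`INTERMEDIATE = 14336`) and the untied output head are assumptions taken from the public Llama-3.1-8B configuration, not values stated in this card.

```python
# Values marked (card) come from this model card; the others are assumptions
# based on the public Llama-3.1-8B configuration.
HIDDEN = 4096               # (card) hidden size
LAYERS = 32                 # (card) decoder layers
VOCAB = 128_256             # (card) vocabulary size
Q_HEADS, KV_HEADS = 32, 8   # (card) grouped-query attention
INTERMEDIATE = 14_336       # assumption: SwiGLU MLP width

head_dim = HIDDEN // Q_HEADS    # 128
kv_dim = KV_HEADS * head_dim    # 1024, shared by groups of 4 query heads

attention = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * kv_dim   # q and o, then k and v
mlp = 3 * HIDDEN * INTERMEDIATE                         # gate, up, down
norms = 2 * HIDDEN                                      # two RMSNorms per layer
per_layer = attention + mlp + norms

embeddings = VOCAB * HIDDEN
lm_head = VOCAB * HIDDEN    # assumption: output head not tied to the embeddings
total = LAYERS * per_layer + embeddings + lm_head + HIDDEN  # + final RMSNorm

print(f"{total:,} parameters ({total / 1e9:.2f}B)")
```

Under these assumptions the script prints `8,030,261,248 parameters (8.03B)`, which matches the figure quoted above.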

Training pipeline

Stage	Objective	Data	Key config	Eval
CPT	inject dental knowledge	~175M-token dental corpus	r=64, α=128, lr 2e-4, 1 epoch	eval loss 1.684 (ppl ≈ 5.39)
SFT	instruction following	~392K Q&A pairs (loss on assistant tokens)	lr 1e-4, 2 epochs, eff. batch 48	eval loss 1.292
DPO	safety / preference alignment	~10.7K preference triplets	β=0.1, lr 2e-6, 1 epoch	eval loss 0.0138, 99.5% pref. acc.
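
The Eval column mixes two kinds of loss. For CPT and SFT it is a mean per-token cross-entropy, so perplexity is its exponential. For DPO it is the preference loss, which depends on how much more the policy prefers the chosen response than the reference model does. The sketch below illustrates both; the DPO inputs are hypothetical log-probabilities, not values from the PALL runs.

```python
import math

def perplexity(mean_token_loss: float) -> float:
    """Perplexity is exp of the mean per-token cross-entropy (in nats)."""
    return math.exp(mean_token_loss)

def dpo_pair_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair, given summed sequence log-probabilities."""
    margin = beta * ((policy_chosen - ref_chosen) - (policy_rejected - ref_rejected))
    # -log(sigmoid(margin)), written as log(1 + exp(-margin))
    return math.log1p(math.exp(-margin))

print(round(perplexity(1.684), 2))   # CPT eval loss from the table
print(round(perplexity(1.292), 2))   # SFT eval loss from the table

# Hypothetical log-probabilities, chosen only to illustrate the formula.
print(dpo_pair_loss(-40.0, -95.0, -60.0, -62.0))
```

A pair counts toward preference accuracy when its margin is positive. At zero margin the loss equals ln 2 ≈ 0.693, so a mean loss near 0.01 implies large positive margins on most evaluation pairs.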

Efficiency stack

QLoRA (4-bit NF4 + LoRA r=64/α=128 on all 7 projections), FlashAttention-2, Unsloth fused kernels, and paged AdamW 8-bit — all in bf16 with gradient checkpointing. Peak VRAM stays under 40 GB throughout.
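
To give a sense of scale, the following sketch estimates how many parameters a rank-64 adapter trains when attached to all seven projections. The module names, the key/value width (8 KV heads × 128), and the MLP width are assumptions based on the standard Llama-3.1-8B layout; the card itself only states r, α, and "all 7 projections".

```python
# A rank-r adapter on a (d_in x d_out) weight adds r * (d_in + d_out) parameters.
RANK, ALPHA = 64, 128
HIDDEN, KV_DIM, INTERMEDIATE, LAYERS = 4096, 1024, 14_336, 32  # partly assumed

projections = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

per_layer = sum(RANK * (d_in + d_out) for d_in, d_out in projections.values())
trainable = LAYERS * per_layer
scaling = ALPHA / RANK   # standard LoRA scaling applied to the low-rank update

print(f"{trainable:,} trainable LoRA parameters, scaling = {scaling}")
```

Under these assumptions the adapter holds 167,772,160 trainable parameters, about 2% of the 8.03B base weights, which stay frozen in 4-bit NF4.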

Results

Held-out dental benchmark (250 MCQ · 250 oral-disease open-QA · 500 dental-forum open-QA):

Stage	Dental MCQ	Note
Baseline Llama-3.1-8B	56.0%	—
+ CPT	4.0%	format collapse (knowledge gained, not lost)
+ SFT	58.0%	best MCQ — knowledge becomes accessible
+ DPO (this model)	48.8%	trades exam rigidity for open-ended quality
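
The stage-to-stage differences are easier to read as counts and percentage-point deltas. The sketch below derives both from the table, assuming each accuracy is computed over all 250 MCQ items.

```python
# MCQ accuracy per stage, copied from the table above.
N_QUESTIONS = 250   # assumption: every stage is scored on all 250 MCQ items
accuracy = {
    "Baseline Llama-3.1-8B": 56.0,
    "+ CPT": 4.0,
    "+ SFT": 58.0,
    "+ DPO (this model)": 48.8,
}

baseline = accuracy["Baseline Llama-3.1-8B"]
for stage, pct in accuracy.items():
    correct = round(pct / 100 * N_QUESTIONS)   # implied number of correct answers
    delta = pct - baseline                      # percentage points vs. baseline
    print(f"{stage:<24} {correct:>3}/{N_QUESTIONS}  {delta:+.1f} pts")
```

With 250 items, each question is worth 0.4 percentage points, so the 2-point gain from baseline to SFT corresponds to five questions.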

DPO's gains land on the deployment-relevant axis (oral-disease open-QA, 1–5 judge scale): correctness 3.78 → 4.61, clarity 3.84 → 4.78, hedging/safety 4.82 → 4.97, and the failure rate (runaway generation / meta-narration) drops from ~100% to 16%.

Intended use

Intended: dental education, clinical-knowledge Q&A, patient-communication drafting, and as an open backbone for further dental fine-tuning or the multimodal PALL-VLM.
Out of scope: autonomous diagnosis or treatment decisions; non-dental medical advice; any use without qualified clinician oversight.

Usage

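import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Harisundar/PALL-Text"
tok = AutoTokenizer.from_pretrained(model_id)
# Load the merged bf16 weights; device_map="auto" places them on the available device(s).
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [
    {"role": "system", "content": "You are a careful dental clinical assistant."},
    {"role": "user", "content": "What is the recommended management for irreversible pulpitis?"},
]
# Render the chat template and tokenize; return_dict=True also returns the attention mask.
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)
# Greedy decoding (do_sample=False) gives reproducible answers.
out = model.generate(**inputs, max_new_tokens=400, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs["input_ids"].shape[1]
print(tok.decode(out[0][prompt_len:], skip_special_tokens=True))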

Training Data Sources & Acknowledgements

PALL is trained on data assembled from many publicly available sources. We gratefully acknowledge the creators of these datasets. Row counts are from our dental-filtered subsets; the original datasets may be larger.

CPT — Continued Pre-Training corpus (~406K documents, ~175M tokens)

Source	Rows	Attribution
OpenAlex dental works	199,518	Priem, J., Piwowar, H., & Orr, R. (2022). OpenAlex: A fully-open index of scholarly works, authors, venues, institutions, and concepts. arXiv:2205.01833
PubMed dental abstracts	95,436	U.S. National Library of Medicine / NCBI
PMC open-access dental full text	49,895	U.S. National Library of Medicine / NCBI
Dental textbooks (cleaned)	32,923	Various authors (see DPO sources below)
ClinicalTrials.gov dental studies	15,379	U.S. National Library of Medicine
Wikipedia dental articles	11,634	Wikimedia Foundation (CC BY-SA)
HuggingFace dental extras	1,098	Community datasets
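
The per-source counts can be checked against the ~406K headline figure, and the mix is easier to see as shares. The sketch below uses only the numbers from the table above; the average tokens per document is a rough ratio derived from the ~175M-token estimate.

```python
# Document counts copied from the CPT table above.
cpt_rows = {
    "OpenAlex dental works": 199_518,
    "PubMed dental abstracts": 95_436,
    "PMC open-access dental full text": 49_895,
    "Dental textbooks (cleaned)": 32_923,
    "ClinicalTrials.gov dental studies": 15_379,
    "Wikipedia dental articles": 11_634,
    "HuggingFace dental extras": 1_098,
}

total = sum(cpt_rows.values())
print(f"total documents: {total:,}")
for source, rows in sorted(cpt_rows.items(), key=lambda kv: -kv[1]):
    print(f"{source:<36} {rows:>8,}  {rows / total:6.1%}")

# Rough average document length, using the ~175M-token corpus estimate.
print(f"approx. tokens per document: {175_000_000 / total:.0f}")
```

The rows sum to 405,883 documents, consistent with the ~406K stated in the section heading.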

SFT — Supervised Fine-Tuning (~412K instruction pairs)

Source	Rows	Attribution
`OpenMed/Medical-Reasoning-SFT-Mega`	24,550	OpenMed team
PubMedQA dental (artificial subset)	20,662	Jin, Q., Dhingra, B., Liu, Z., Cohen, W.W., & Lu, X. (2019). PubMedQA: A Dataset for Biomedical Research Question Answering. EMNLP 2019
`ibivibiv/medical_instruct_en`	19,336	ibivibiv (HuggingFace)
`FreedomIntelligence/ApolloCorpus`	18,016	Wang, X. et al. (2024). Apollo: A Lightweight Multilingual Medical LLM towards Democratizing Medical AI to 6B People. arXiv:2403.03640
ChatDoctor HealthcareMagic dental	12,423	Li, Y. et al. (2023). ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) Using Medical Domain Knowledge. Cureus 15(6)
`exafluence/Open-MedQA-Nexus`	11,608	Exafluence team
`accolade2025/dental-llama2-35k`	10,197	accolade2025 (HuggingFace)
`electricsheepafrica/oral-health-dental-disease`	10,000	electricsheepafrica (HuggingFace)
`Intelligent-Internet/II-Medical-Reasoning-SFT`	9,738	Intelligent Internet team
`ruslanmv/ai-medical-chatbot`	9,681	ruslanmv (HuggingFace)
`AGBonnet/augmented-clinical-notes`	16,265	AGBonnet (HuggingFace)
`FremyCompany/Asclepius-Synthetic-Clinical-Notes-QA-EN`	8,138	Kweon, S. et al. (2023). Publicly Shareable Clinical Large Language Model Built on Synthetic Clinical Notes. arXiv:2309.00237
MedMCQA dental subset	6,315	Pal, A., Umapathi, L.K., & Sankarasubbu, M. (2022). MedMCQA: A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering. CHIL 2022
`Mxode/Chinese-Medical-Instruct-1M`	6,528	Mxode (HuggingFace)
`miriad/miriad-4.4M`	5,995	Zheng, Q. et al. (2025). MIRIAD: Augmenting LLMs with millions of medical query-response pairs. arXiv:2506.06091
HuatuoGPT2-SFT (GPT-4 generated)	4,977	Chen, J. et al. (2023). HuatuoGPT-II, One-stage Training for Medical Adaption of LLMs. arXiv:2311.09774
`BrainHealthAI/MedQA_mutilangual`	3,970	BrainHealthAI (HuggingFace); originally Jin, D. et al. (2021). What Disease does this Patient Have? A Large-scale Open Domain Question Answering Dataset from Medical Exams. Applied Sciences
Dental forum / Jonathan Kang dental	2,419	Community dental forums
`Tuminha/dental-evidence-dataset`	1,931	Tuminha (HuggingFace)
`FreedomIntelligence/medical-o1-reasoning-SFT`	2,910	Chen, J. et al. (2024). HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs. arXiv:2412.18925
`naazimsnh02/dentalgemma-instruct`	1,743	naazimsnh02 (HuggingFace)
`AnonymousSub/MedQuAD_47441_Question_Answer_Pairs`	1,734	Ben Abacha, A. & Demner-Fushman, D. (2019). A Question-Entailment Approach to Question Answering. BMC Bioinformatics
`HPAI-BSC/Aloe-Beta-Medical-Collection`	1,633	Gururajan, A.K. et al. (2024). Aloe: A Family of Fine-tuned Open Healthcare LLMs. arXiv:2405.01886
`saidonepudi8/dental_tc`	1,149	saidonepudi8 (HuggingFace)
`lavita/AlpaCare-MedInstruct-52k`	1,063	Zhang, X. et al. (2023). AlpaCare: Instruction-tuned Large Language Models for Medical Application. arXiv:2310.14558
Textbook-derived SFT	2,039	Various dental textbook authors
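
The training pipeline notes that SFT computes loss on assistant tokens only. A minimal sketch of that masking is shown below, assuming per-token role labels are available. The `mask_non_assistant` helper and the toy ids are hypothetical; `-100` is the label value that PyTorch's cross-entropy loss ignores by default.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss

def mask_non_assistant(token_ids, roles):
    """Return labels that keep assistant tokens and ignore everything else.

    token_ids: token ids for one conversation
    roles:     per-token role strings ("system", "user", "assistant")
    """
    if len(token_ids) != len(roles):
        raise ValueError("token_ids and roles must have the same length")
    return [tok if role == "assistant" else IGNORE_INDEX
            for tok, role in zip(token_ids, roles)]

# Toy ids, hypothetical and not produced by the PALL tokenizer.
ids = [11, 12, 13, 21, 22, 31, 32, 33]
roles = ["system"] * 3 + ["user"] * 2 + ["assistant"] * 3
print(mask_non_assistant(ids, roles))
# [-100, -100, -100, -100, -100, 31, 32, 33]
```

With labels built this way, prompt tokens still provide context in the forward pass but contribute nothing to the gradient.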

DPO — Direct Preference Optimization (~10.7K preference triplets)

DPO preference pairs were constructed from dental textbook content by scoring candidate responses on a 5-axis rubric (correctness, conciseness, hedging, clarity, safety) and pairing top- with bottom-quartile responses. Source textbooks include works by:

Hargreaves, K.M. — Cohen's Pathways of the Pulp (9th Ed.)
Humes, H.D. — Kelley's Textbook of Internal Medicine (4th Ed.)
Harvey, A. — Lippincott's Illustrated Biochemistry (5th Ed.)
Myers, J. — Oral Cancer Metastasis
Koolman, J. — Color Atlas of Biochemistry (2nd Ed.)
Fonseca, R.J. — Oral and Maxillofacial Surgery (Vol I)
Mehta, N.R. — Head, Face, and Neck Pain
Fejerskov, O. & Kidd, E. — Dental Caries: The Disease and its Clinical Management (2nd Ed.)
Mitchell, D.A. — Oxford Handbook of Clinical Dentistry (6th Ed.)
Regezi, J.A. — Oral Pathology (7th Ed.)
Malamed, S.F. — Sedation: A Guide to Patient Management (5th Ed.)
And 100+ additional dental and medical textbooks
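
The pair-construction step described at the start of this section can be sketched as follows. This is an illustration under stated assumptions, not the PALL implementation: the rubric scores are averaged with equal weights, quartiles are computed per prompt, and the best chosen response is matched with the worst rejected one. The `build_pairs` helper and the candidate fields are hypothetical.

```python
from statistics import mean, quantiles

AXES = ("correctness", "conciseness", "hedging", "clarity", "safety")

def build_pairs(prompt, candidates):
    """Pair top-quartile with bottom-quartile responses for one prompt.

    candidates: list of dicts with a "text" field and one score per rubric axis.
    """
    scored = sorted(candidates, key=lambda c: mean(c[a] for a in AXES))
    totals = [mean(c[a] for a in AXES) for c in scored]
    q1, _, q3 = quantiles(totals, n=4)   # quartile cut points
    rejected = [c for c, t in zip(scored, totals) if t <= q1]
    chosen = [c for c, t in zip(scored, totals) if t >= q3]
    # Best chosen with worst rejected, second best with second worst, and so on.
    return [
        {"prompt": prompt, "chosen": good["text"], "rejected": bad["text"]}
        for good, bad in zip(reversed(chosen), rejected)
    ]

# Eight toy candidates with uniform per-axis scores from 1.0 to 4.5.
cands = [{"text": f"answer {i}", **dict.fromkeys(AXES, 1.0 + 0.5 * i)} for i in range(8)]
for pair in build_pairs("How is dry socket managed?", cands):
    print(pair["chosen"], ">", pair["rejected"])
```

Each returned dict has the (prompt, chosen, rejected) shape of a preference triplet.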

Limitations & risks

May hallucinate or omit; not a substitute for professional clinical judgment.
Strict multiple-choice accuracy (48.8%) is 9.2 points below the SFT checkpoint (58.0%) and 7.2 points below the base model (56.0%). This trade-off is by design: DPO favors open-ended clinical reasoning.
Conciseness on long, ambiguous patient narratives remains the weakest dimension.
Not evaluated for long-horizon treatment planning, rare pathologies, or adversarial cases.

Citation

If you use PALL, please cite:

@misc{rajendran2026pall,
  title        = {PALL: Post-training Adaptation of Large Language Models at Low Cost
                  for Dental-Domain Specialization},
  author       = {Rajendran, Harisundar},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/Harisundar/PALL-Text}}
}

Foundational works

@inproceedings{hu2022lora,
  title={LoRA: Low-Rank Adaptation of Large Language Models},
  author={Hu, Edward J. and Shen, Yelong and Wallis, Phillip and Allen-Zhu, Zeyuan
          and Li, Yuanzhi and Wang, Shean and Wang, Lu and Chen, Weizhu},
  booktitle={ICLR}, year={2022}
}
@inproceedings{dettmers2023qlora,
  title={QLoRA: Efficient Finetuning of Quantized LLMs},
  author={Dettmers, Tim and Pagnoni, Artidoro and Holtzman, Ari and Zettlemoyer, Luke},
  booktitle={NeurIPS}, year={2023}
}
@inproceedings{rafailov2023dpo,
  title={Direct Preference Optimization: Your Language Model is Secretly a Reward Model},
  author={Rafailov, Rafael and Sharma, Archit and Mitchell, Eric and Ermon, Stefano
          and Manning, Christopher D. and Finn, Chelsea},
  booktitle={NeurIPS}, year={2023}
}
@article{dao2023flashattention2,
  title={FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning},
  author={Dao, Tri}, journal={arXiv:2307.08691}, year={2023}
}
@misc{unsloth2024,
  title={Unsloth: 2x faster, 70\% less memory LLM finetuning},
  author={Han, Daniel and Han, Michael and {Unsloth team}}, year={2024},
  howpublished={\url{https://github.com/unslothai/unsloth}}
}
@article{grattafiori2024llama3,
  title={The Llama 3 Herd of Models},
  author={Grattafiori, Aaron and others}, journal={arXiv:2407.21783}, year={2024}
}

Key dataset citations

@inproceedings{jin2019pubmedqa,
  title={PubMedQA: A Dataset for Biomedical Research Question Answering},
  author={Jin, Qiao and Dhingra, Bhuwan and Liu, Zhengping and Cohen, William and Lu, Xinghua},
  booktitle={EMNLP-IJCNLP}, pages={2567--2577}, year={2019}
}
@inproceedings{pal2022medmcqa,
  title={MedMCQA: A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering},
  author={Pal, Ankit and Umapathi, Logesh Kumar and Sankarasubbu, Malaikannan},
  booktitle={CHIL}, series={PMLR}, volume={174}, pages={248--260}, year={2022}
}
@misc{wang2024apollo,
  title={Apollo: Lightweight Multilingual Medical LLMs towards Democratizing Medical AI to 6B People},
  author={Xidong Wang and Nuo Chen and Junyin Chen and Yan Hu and Yidong Wang and Xiangbo Wu
          and Anningzhe Gao and Xiang Wan and Haizhou Li and Benyou Wang},
  eprint={2403.03640}, archivePrefix={arXiv}, year={2024}
}
@article{li2023chatdoctor,
  title={ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) Using Medical Domain Knowledge},
  author={Li, Yunxiang and Li, Zihan and Zhang, Kai and Dan, Ruilong and Jiang, Steve and Zhang, You},
  journal={Cureus}, volume={15}, number={6}, year={2023}, doi={10.7759/cureus.40895}
}
@article{chen2023huatuogpt2,
  title={HuatuoGPT-II, One-stage Training for Medical Adaption of LLMs},
  author={Chen, Junying and Wang, Xidong and Gao, Anningzhe and Jiang, Feng and Chen, Shunian
          and Zhang, Hongbo and Song, Dingjie and Xie, Wenya and Kong, Chuyi and Li, Jianquan
          and Wan, Xiang and Li, Haizhou and Wang, Benyou},
  journal={arXiv preprint arXiv:2311.09774}, year={2023}
}
@misc{kweon2023asclepius,
  title={Publicly Shareable Clinical Large Language Model Built on Synthetic Clinical Notes},
  author={Sunjun Kweon and Junu Kim and Jiyoun Kim and Sujeong Im and Eunbyeol Cho and Seongsu Bae
          and Jungwoo Oh and Gyubok Lee and Jong Hak Moon and Seng Chan You and Seungjin Baek
          and Chang Hoon Han and Yoon Bin Jung and Yohan Jo and Edward Choi},
  eprint={2309.00237}, archivePrefix={arXiv}, year={2023}
}
@misc{zhang2023alpacare,
  title={AlpaCare: Instruction-tuned Large Language Models for Medical Application},
  author={Xinlu Zhang and Chenxin Tian and Xianjun Yang and Lichang Chen and Zekun Li and Linda Ruth Petzold},
  eprint={2310.14558}, archivePrefix={arXiv}, year={2023}
}
@article{benabacha2019medquad,
  title={A Question-Entailment Approach to Question Answering},
  author={Ben Abacha, Asma and Demner-Fushman, Dina},
  journal={BMC Bioinformatics}, volume={20}, number={1}, pages={511:1--511:23}, year={2019}
}
@article{jin2021medqa,
  title={What Disease does this Patient Have? A Large-scale Open Domain Question Answering Dataset from Medical Exams},
  author={Jin, Di and Pan, Eileen and Oufattole, Nassim and Weng, Wei-Hung and Fang, Hanyi and Szolovits, Peter},
  journal={Applied Sciences}, volume={11}, number={14}, pages={6421}, year={2021}, doi={10.3390/app11146421}
}
@article{priem2022openalex,
  title={OpenAlex: A fully-open index of scholarly works, authors, venues, institutions, and concepts},
  author={Priem, Jason and Piwowar, Heather and Orr, Richard},
  journal={arXiv preprint arXiv:2205.01833}, year={2022}
}
@misc{zheng2025miriad,
  title={MIRIAD: Augmenting LLMs with millions of medical query-response pairs},
  author={Qinyue Zheng and Salman Abdullah and Sam Rawal and Cyril Zakka and Sophie Ostmeier
          and Maximilian Purk and Eduardo Reis and Eric J. Topol and Jure Leskovec and Michael Moor},
  eprint={2506.06091}, archivePrefix={arXiv}, year={2025}
}
@misc{gururajan2024aloe,
  title={Aloe: A Family of Fine-tuned Open Healthcare LLMs},
  author={Ashwin Kumar Gururajan and Enrique Lopez-Cuena and Jordi Bayarri-Planas and Adrian Tormos
          and Daniel Hinjos and Pablo Bernabeu-Perez and Anna Arias-Duart and Pablo Agustin Martin-Torres
          and Lucia Urcelay-Ganzabal and Marta Gonzalez-Mallo and Sergio Alvarez-Napagao
          and Eduard Ayguade-Parra and Ulises Cortes and Dario Garcia-Gasulla},
  eprint={2405.01886}, archivePrefix={arXiv}, year={2024}
}
@misc{chen2024huatuogpto1,
  title={HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs},
  author={Junying Chen and Zhenyang Cai and Ke Ji and Xidong Wang and Wanlong Liu and Rongsheng Wang
          and Jianye Hou and Benyou Wang},
  eprint={2412.18925}, archivePrefix={arXiv}, year={2024}
}

Acknowledgements

Built on Llama-3.1 (Meta), Unsloth, Hugging Face TRL/PEFT/Transformers, and bitsandbytes. We thank all dataset creators listed above for making their data publicly available. PALL-Text is intended for research and clinical decision support only, under qualified clinician oversight.