PALL-Text — A Dental-Domain Llama-3.1-8B
PALL-Text is a dental-domain–specialized large language model, adapted from Llama-3.1-8B through a three-stage post-training pipeline — Continued Pre-Training (CPT) → Supervised Fine-Tuning (SFT) → Direct Preference Optimization (DPO) — run end-to-end under 4-bit QLoRA on a single A100-40GB GPU for roughly $20 of cloud compute.
This repository hosts the final, fully-merged bf16 model (all three adapters merged into the base weights — no PEFT adapter required at inference).
- Developed by: Harisundar R
- Base model: unsloth/Meta-Llama-3.1-8B
- Code: PALL on GitHub
- Companion VLM: Harisundar/PALL-VLM
- Training data: Harisundar/pall
- Language: English
- License: Llama 3.1 Community License
Model description
Frontier LLMs remain clinically uneven on dental tasks, while specialized dental models are typically closed-weight and require multi-GPU clusters. PALL closes this gap with an open, single-GPU, reproducible recipe. The contribution is integration, not a new algorithm: established parameter-efficient techniques combined into one affordable pipeline and applied to dentistry — including preference tuning for clinical safety, which the dental-LLM literature otherwise lacks.
Architecture
- LlamaForCausalLM, 8.03B parameters, bf16, 32 layers, hidden size 4096, vocab 128,256.
- Grouped-Query Attention (32 query / 8 KV heads), RoPE, RMSNorm, SwiGLU MLP.
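To see what the 32-query / 8-KV-head layout means at inference time, the sketch below estimates the bf16 KV-cache footprint from the figures listed above. The head dimension (hidden size divided by the number of query heads) and the 2,048-token context are assumptions made for illustration; neither is stated in this card.

```python
# Rough KV-cache sizing for the architecture listed above (illustrative only).
LAYERS = 32
HIDDEN = 4096
Q_HEADS = 32
KV_HEADS = 8
BYTES_BF16 = 2

# Assumption: head_dim = hidden size / number of query heads.
HEAD_DIM = HIDDEN // Q_HEADS


def kv_cache_bytes(tokens: int, kv_heads: int) -> int:
    """Bytes needed to cache keys and values for `tokens` positions."""
    per_token = 2 * LAYERS * kv_heads * HEAD_DIM * BYTES_BF16  # 2 = keys + values
    return tokens * per_token


gqa = kv_cache_bytes(2048, KV_HEADS)  # grouped-query attention (8 KV heads)
mha = kv_cache_bytes(2048, Q_HEADS)   # hypothetical full multi-head baseline
print(f"GQA: {gqa / 2**20:.0f} MiB, MHA: {mha / 2**20:.0f} MiB, ratio {mha // gqa}x")
```

Under these assumptions, sharing each KV head across four query heads cuts the cache for a single 2,048-token sequence from 1 GiB to 256 MiB.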
Training pipeline
| Stage | Objective | Data | Key config | Eval |
|---|---|---|---|---|
| CPT | inject dental knowledge | ~175M-token dental corpus | r=64, α=128, lr 2e-4, 1 epoch | eval loss 1.684 (ppl ≈ 5.39) |
| SFT | instruction following | ~392K Q&A pairs (loss on assistant tokens) | lr 1e-4, 2 epochs, eff. batch 48 | eval loss 1.292 |
| DPO | safety / preference alignment | ~10.7K preference triplets | β=0.1, lr 2e-6, 1 epoch | eval loss 0.0138, 99.5% pref. acc. |
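The Eval column can be read with two small formulas, sketched here. The sketch assumes the reported CPT and SFT losses are mean per-token cross-entropy in nats, and it uses the standard per-pair DPO objective; the function names are illustrative and are not taken from the PALL code.

```python
import math


def perplexity(mean_nll: float) -> float:
    """Perplexity is the exponential of the mean per-token negative log-likelihood."""
    return math.exp(mean_nll)


def dpo_pair_loss(chosen_logratio: float, rejected_logratio: float, beta: float = 0.1) -> float:
    """Standard DPO loss for one pair: -log(sigmoid(beta * margin)).

    Each log-ratio is log pi_policy(y|x) - log pi_reference(y|x).
    """
    x = beta * (chosen_logratio - rejected_logratio)
    # Numerically stable softplus(-x) = log(1 + exp(-x)).
    return math.log1p(math.exp(-x)) if x >= 0 else -x + math.log1p(math.exp(x))


print(f"CPT perplexity: {perplexity(1.684):.2f}")
print(f"DPO loss at zero margin: {dpo_pair_loss(0.0, 0.0):.4f}")
```

A policy identical to the reference starts at ln 2 ≈ 0.693 per pair, so a mean evaluation loss of 0.0138 suggests large positive margins on most held-out pairs. A pair counts toward preference accuracy whenever its margin is positive.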
Efficiency stack
QLoRA (4-bit NF4 + LoRA r=64/α=128 on all 7 projections), FlashAttention-2, Unsloth fused kernels, and paged AdamW 8-bit — all in bf16 with gradient checkpointing. Peak VRAM stays under 40 GB throughout.
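As a rough check on how small the trainable footprint is, the sketch below counts LoRA parameters for r=64 on seven projections per layer. The projection names and the MLP intermediate size of 14,336 are assumptions based on the public Llama-3.1-8B configuration; this card does not state them.

```python
# Trainable LoRA parameters for r=64 on the seven projection matrices.
R = 64
ALPHA = 128
LAYERS = 32
HIDDEN = 4096
KV_DIM = 8 * (HIDDEN // 32)  # 8 KV heads x head_dim 128 = 1024
MLP_DIM = 14336              # assumption: Llama-3.1-8B intermediate size

# (input_dim, output_dim) of each adapted projection (names are assumed).
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP_DIM),
    "up_proj": (HIDDEN, MLP_DIM),
    "down_proj": (MLP_DIM, HIDDEN),
}

# A LoRA adapter adds A (r x in) and B (out x r), i.e. r * (in + out) weights.
per_layer = sum(R * (d_in + d_out) for d_in, d_out in PROJECTIONS.values())
trainable = per_layer * LAYERS
scaling = ALPHA / R  # factor applied to the low-rank update

print(f"{trainable:,} trainable parameters ({trainable / 8.03e9:.2%} of 8.03B), scaling {scaling}")
```

Under these assumptions the adapters hold about 168M weights, roughly 2% of the base model, which is why optimizer state stays small even with r=64.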
Results
Held-out dental benchmark (250 MCQ · 250 oral-disease open-QA · 500 dental-forum open-QA):
| Stage | Dental MCQ | Note |
|---|---|---|
| Baseline Llama-3.1-8B | 56.0% | — |
| + CPT | 4.0% | format collapse (knowledge gained, not lost) |
| + SFT | 58.0% | best MCQ — knowledge becomes accessible |
| + DPO (this model) | 48.8% | trades exam rigidity for open-ended quality |
DPO's gains land on the deployment-relevant axis (oral-disease open-QA, 1–5 judge scale): correctness 3.78 → 4.61, clarity 3.84 → 4.78, hedging/safety 4.82 → 4.97, and the failure rate (runaway generation / meta-narration) drops from ~100% to 16%.
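For a sense of scale, the MCQ percentages in the table above can be converted into question counts. This sketch assumes each accuracy is computed over all 250 MCQ items.

```python
# Convert the reported MCQ accuracies into question counts on the 250-item set.
MCQ_ITEMS = 250
accuracy = {
    "Baseline Llama-3.1-8B": 56.0,
    "+ CPT": 4.0,
    "+ SFT": 58.0,
    "+ DPO": 48.8,
}

correct = {stage: round(pct / 100 * MCQ_ITEMS) for stage, pct in accuracy.items()}
baseline = correct["Baseline Llama-3.1-8B"]

for stage, n in correct.items():
    print(f"{stage:<22} {n:>3}/{MCQ_ITEMS}  ({n - baseline:+d} vs. baseline)")
```

On that assumption, the SFT checkpoint answers 5 more questions than the baseline, and the final DPO model answers 18 fewer.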
Intended use
- Intended: dental education, clinical-knowledge Q&A, patient-communication drafting, and as an open backbone for further dental fine-tuning or the multimodal PALL-VLM.
- Out of scope: autonomous diagnosis or treatment decisions; non-dental medical advice; any use without qualified clinician oversight.
Usage
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Harisundar/PALL-Text"
tok = AutoTokenizer.from_pretrained(model_id)
# Load the merged bf16 weights; device_map="auto" requires the accelerate package.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [
    {"role": "system", "content": "You are a careful dental clinical assistant."},
    {"role": "user", "content": "What is the recommended management for irreversible pulpitis?"},
]
# return_dict=True returns input_ids and attention_mask, so generate() receives both.
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_dict=True, return_tensors="pt"
).to(model.device)
out = model.generate(**inputs, max_new_tokens=400, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs["input_ids"].shape[1]
print(tok.decode(out[0][prompt_len:], skip_special_tokens=True))
Training Data Sources & Acknowledgements
PALL is trained on data assembled from many publicly available sources. We gratefully acknowledge the creators of these datasets. Row counts are from our dental-filtered subsets; the original datasets may be larger.
CPT — Continued Pre-Training corpus (~406K documents, ~175M tokens)
| Source | Rows | Attribution |
|---|---|---|
| OpenAlex dental works | 199,518 | Priem, J., Piwowar, H., & Orr, R. (2022). OpenAlex: A fully-open index of scholarly works, authors, venues, institutions, and concepts. arXiv:2205.01833 |
| PubMed dental abstracts | 95,436 | U.S. National Library of Medicine / NCBI |
| PMC open-access dental full text | 49,895 | U.S. National Library of Medicine / NCBI |
| Dental textbooks (cleaned) | 32,923 | Various authors (see DPO sources below) |
| ClinicalTrials.gov dental studies | 15,379 | U.S. National Library of Medicine |
| Wikipedia dental articles | 11,634 | Wikimedia Foundation (CC BY-SA) |
| HuggingFace dental extras | 1,098 | Community datasets |
SFT — Supervised Fine-Tuning (~412K instruction pairs)
| Source | Rows | Attribution |
|---|---|---|
| OpenMed/Medical-Reasoning-SFT-Mega | 24,550 | OpenMed team |
| PubMedQA dental (artificial subset) | 20,662 | Jin, Q., Dhingra, B., Liu, Z., Cohen, W.W., & Lu, X. (2019). PubMedQA: A Dataset for Biomedical Research Question Answering. EMNLP 2019 |
| ibivibiv/medical_instruct_en | 19,336 | ibivibiv (HuggingFace) |
| FreedomIntelligence/ApolloCorpus | 18,016 | Wang, X. et al. (2024). Apollo: A Lightweight Multilingual Medical LLM towards Democratizing Medical AI to 6B People. arXiv:2403.03640 |
| ChatDoctor HealthcareMagic dental | 12,423 | Li, Y. et al. (2023). ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) Using Medical Domain Knowledge. Cureus 15(6) |
| exafluence/Open-MedQA-Nexus | 11,608 | Exafluence team |
| accolade2025/dental-llama2-35k | 10,197 | accolade2025 (HuggingFace) |
| electricsheepafrica/oral-health-dental-disease | 10,000 | electricsheepafrica (HuggingFace) |
| Intelligent-Internet/II-Medical-Reasoning-SFT | 9,738 | Intelligent Internet team |
| ruslanmv/ai-medical-chatbot | 9,681 | ruslanmv (HuggingFace) |
| AGBonnet/augmented-clinical-notes | 16,265 | AGBonnet (HuggingFace) |
| FremyCompany/Asclepius-Synthetic-Clinical-Notes-QA-EN | 8,138 | Kweon, S. et al. (2023). Publicly Shareable Clinical Large Language Model Built on Synthetic Clinical Notes. arXiv:2309.00237 |
| MedMCQA dental subset | 6,315 | Pal, A., Umapathi, L.K., & Sankarasubbu, M. (2022). MedMCQA: A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering. CHIL 2022 |
| Mxode/Chinese-Medical-Instruct-1M | 6,528 | Mxode (HuggingFace) |
| miriad/miriad-4.4M | 5,995 | Zheng, Q. et al. (2025). MIRIAD: Augmenting LLMs with millions of medical query-response pairs. arXiv:2506.06091 |
| HuatuoGPT2-SFT (GPT-4 generated) | 4,977 | Chen, J. et al. (2023). HuatuoGPT-II: One-stage Training for Medical Adaption of LLMs. arXiv:2311.09774 |
| BrainHealthAI/MedQA_mutilangual | 3,970 | BrainHealthAI (HuggingFace); originally Jin, D. et al. (2021). What Disease does this Patient Have? A Large-scale Open Domain Question Answering Dataset from Medical Exams. Applied Sciences |
| Dental forum / Jonathan Kang dental | 2,419 | Community dental forums |
| Tuminha/dental-evidence-dataset | 1,931 | Tuminha (HuggingFace) |
| FreedomIntelligence/medical-o1-reasoning-SFT | 2,910 | FreedomIntelligence team |
| naazimsnh02/dentalgemma-instruct | 1,743 | naazimsnh02 (HuggingFace) |
| AnonymousSub/MedQuAD_47441_Question_Answer_Pairs | 1,734 | Ben Abacha, A. & Demner-Fushman, D. (2019). A Question-Entailment Approach to Question Answering. BMC Bioinformatics |
| HPAI-BSC/Aloe-Beta-Medical-Collection | 1,633 | Gururajan, A.K. et al. (2024). Aloe: A Family of Fine-tuned Open Healthcare LLMs. arXiv:2405.01886 |
| saidonepudi8/dental_tc | 1,149 | saidonepudi8 (HuggingFace) |
| lavita/AlpaCare-MedInstruct-52k | 1,063 | Zhang, X. et al. (2023). AlpaCare: Instruction-tuned Large Language Models for Medical Application. arXiv:2310.14558 |
| Textbook-derived SFT | 2,039 | Various dental textbook authors |
DPO — Direct Preference Optimization (~10.7K preference triplets)
DPO preference pairs were constructed from dental textbook content by scoring candidate responses on a 5-axis rubric (correctness, conciseness, hedging, clarity, safety) and pairing top- with bottom-quartile responses. Source textbooks include works by:
- Hargreaves, K.M. — Cohen's Pathways of the Pulp (9th Ed.)
- Humes, H.D. — Kelley's Textbook of Internal Medicine (4th Ed.)
- Harvey, A. — Lippincott's Illustrated Biochemistry (5th Ed.)
- Myers, J. — Oral Cancer Metastasis
- Koolman, J. — Color Atlas of Biochemistry (2nd Ed.)
- Fonseca, R.J. — Oral and Maxillofacial Surgery (Vol I)
- Mehta, N.R. — Head, Face, and Neck Pain
- Fejerskov, O. & Kidd, E. — Dental Caries: The Disease and its Clinical Management (2nd Ed.)
- Mitchell, D.A. — Oxford Handbook of Clinical Dentistry (6th Ed.)
- Regezi, J.A. — Oral Pathology (7th Ed.)
- Malamed, S.F. — Sedation: A Guide to Patient Management (5th Ed.)
- And 100+ additional dental and medical textbooks
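The quartile pairing described above can be sketched as follows. This is a minimal illustration, not the PALL implementation: the function name, the unweighted mean over the five rubric axes, and the best-with-worst matching order are all assumptions.

```python
from statistics import mean

AXES = ("correctness", "conciseness", "hedging", "clarity", "safety")


def build_preference_pairs(prompt, candidates):
    """Pair top-quartile with bottom-quartile candidates for one prompt.

    `candidates` is a list of (response_text, {axis: score}) tuples.
    Assumption: the overall score is the unweighted mean of the five axes.
    """
    ranked = sorted(candidates, key=lambda c: mean(c[1][a] for a in AXES), reverse=True)
    q = len(ranked) // 4  # quartile size; fewer than 4 candidates yields no pairs
    top, bottom = ranked[:q], ranked[len(ranked) - q:]
    # Assumption: best is paired with worst, second best with second worst, and so on.
    return [
        {"prompt": prompt, "chosen": good[0], "rejected": bad[0]}
        for good, bad in zip(top, reversed(bottom))
    ]


# Toy data: eight candidates whose scores rise from 1 to 4.5 on every axis.
candidates = [(f"answer {i}", {a: 1 + i * 0.5 for a in AXES}) for i in range(8)]
pairs = build_preference_pairs("What causes dental caries?", candidates)
print(pairs)
```

Each resulting dictionary is one (prompt, chosen, rejected) triplet in the shape commonly used by DPO trainers. Discarding the middle two quartiles keeps only pairs with a clear quality gap.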
Limitations & risks
- May hallucinate or omit; not a substitute for professional clinical judgment.
- Strict multiple-choice accuracy (48.8%) is 9.2 points below the SFT checkpoint (58.0%) and 7.2 points below the base model (56.0%). This is by design: DPO favors open-ended clinical reasoning over exam-style answers.
- Conciseness on long, ambiguous patient narratives remains the weakest dimension.
- Not evaluated for long-horizon treatment planning, rare pathologies, or adversarial cases.
Citation
If you use PALL, please cite:
@misc{rajendran2026pall,
title = {PALL: Post-training Adaptation of Large Language Models at Low Cost
for Dental-Domain Specialization},
author = {Rajendran, Harisundar},
year = {2026},
howpublished = {\url{https://huggingface.co/Harisundar/PALL-Text}},
}
Foundational works
@inproceedings{hu2022lora,
title={LoRA: Low-Rank Adaptation of Large Language Models},
author={Hu, Edward J. and Shen, Yelong and Wallis, Phillip and Allen-Zhu, Zeyuan
and Li, Yuanzhi and Wang, Shean and Wang, Lu and Chen, Weizhu},
booktitle={ICLR}, year={2022}
}
@inproceedings{dettmers2023qlora,
title={QLoRA: Efficient Finetuning of Quantized LLMs},
author={Dettmers, Tim and Pagnoni, Artidoro and Holtzman, Ari and Zettlemoyer, Luke},
booktitle={NeurIPS}, year={2023}
}
@inproceedings{rafailov2023dpo,
title={Direct Preference Optimization: Your Language Model is Secretly a Reward Model},
author={Rafailov, Rafael and Sharma, Archit and Mitchell, Eric and Ermon, Stefano
and Manning, Christopher D. and Finn, Chelsea},
booktitle={NeurIPS}, year={2023}
}
@article{dao2023flashattention2,
title={FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning},
author={Dao, Tri}, journal={arXiv:2307.08691}, year={2023}
}
@misc{unsloth2024,
title={Unsloth: 2x faster, 70\% less memory LLM finetuning},
author={Han, Daniel and Han, Michael and {Unsloth team}}, year={2024},
howpublished={\url{https://github.com/unslothai/unsloth}}
}
@article{grattafiori2024llama3,
title={The Llama 3 Herd of Models},
author={Grattafiori, Aaron and others}, journal={arXiv:2407.21783}, year={2024}
}
Key dataset citations
@inproceedings{jin2019pubmedqa,
title={PubMedQA: A Dataset for Biomedical Research Question Answering},
author={Jin, Qiao and Dhingra, Bhuwan and Liu, Zhengping and Cohen, William and Lu, Xinghua},
booktitle={EMNLP-IJCNLP}, pages={2567--2577}, year={2019}
}
@InProceedings{pal2022medmcqa,
title={MedMCQA: A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering},
author={Pal, Ankit and Umapathi, Logesh Kumar and Sankarasubbu, Malaikannan},
booktitle={CHIL}, series={PMLR}, volume={174}, pages={248--260}, year={2022}
}
@misc{wang2024apollo,
title={Apollo: Lightweight Multilingual Medical LLMs towards Democratizing Medical AI to 6B People},
author={Xidong Wang and Nuo Chen and Junyin Chen and Yan Hu and Yidong Wang and Xiangbo Wu
and Anningzhe Gao and Xiang Wan and Haizhou Li and Benyou Wang},
eprint={2403.03640}, archivePrefix={arXiv}, year={2024}
}
@article{li2023chatdoctor,
title={ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) Using Medical Domain Knowledge},
author={Li, Yunxiang and Li, Zihan and Zhang, Kai and Dan, Ruilong and Jiang, Steve and Zhang, You},
journal={Cureus}, volume={15}, number={6}, year={2023}, doi={10.7759/cureus.40895}
}
@article{chen2023huatuogpt2,
title={HuatuoGPT-II, One-stage Training for Medical Adaption of LLMs},
author={Chen, Junying and Wang, Xidong and Gao, Anningzhe and Jiang, Feng and Chen, Shunian
and Zhang, Hongbo and Song, Dingjie and Xie, Wenya and Kong, Chuyi and Li, Jianquan
and Wan, Xiang and Li, Haizhou and Wang, Benyou},
journal={arXiv preprint arXiv:2311.09774}, year={2023}
}
@misc{kweon2023asclepius,
title={Publicly Shareable Clinical Large Language Model Built on Synthetic Clinical Notes},
author={Sunjun Kweon and Junu Kim and Jiyoun Kim and Sujeong Im and Eunbyeol Cho and Seongsu Bae
and Jungwoo Oh and Gyubok Lee and Jong Hak Moon and Seng Chan You and Seungjin Baek
and Chang Hoon Han and Yoon Bin Jung and Yohan Jo and Edward Choi},
eprint={2309.00237}, archivePrefix={arXiv}, year={2023}
}
@misc{zhang2023alpacare,
title={AlpaCare: Instruction-tuned Large Language Models for Medical Application},
author={Xinlu Zhang and Chenxin Tian and Xianjun Yang and Lichang Chen and Zekun Li and Linda Ruth Petzold},
eprint={2310.14558}, archivePrefix={arXiv}, year={2023}
}
@article{benabacha2019medquad,
title={A Question-Entailment Approach to Question Answering},
author={Ben Abacha, Asma and Demner-Fushman, Dina},
journal={BMC Bioinformatics}, volume={20}, number={1}, pages={511:1--511:23}, year={2019}
}
@article{jin2021medqa,
title={What Disease does this Patient Have? A Large-scale Open Domain Question Answering Dataset from Medical Exams},
author={Jin, Di and Pan, Eileen and Oufattole, Nassim and Weng, Wei-Hung and Fang, Hanyi and Szolovits, Peter},
journal={Applied Sciences}, volume={11}, number={14}, pages={6421}, year={2021}, doi={10.3390/app11146421}
}
@article{priem2022openalex,
title={OpenAlex: A fully-open index of scholarly works, authors, venues, institutions, and concepts},
author={Priem, Jason and Piwowar, Heather and Orr, Richard},
journal={arXiv preprint arXiv:2205.01833}, year={2022}
}
@misc{zheng2025miriad,
title={MIRIAD: Augmenting LLMs with millions of medical query-response pairs},
author={Qinyue Zheng and Salman Abdullah and Sam Rawal and Cyril Zakka and Sophie Ostmeier
and Maximilian Purk and Eduardo Reis and Eric J. Topol and Jure Leskovec and Michael Moor},
eprint={2506.06091}, archivePrefix={arXiv}, year={2025}
}
@misc{gururajan2024aloe,
title={Aloe: A Family of Fine-tuned Open Healthcare LLMs},
author={Ashwin Kumar Gururajan and Enrique Lopez-Cuena and Jordi Bayarri-Planas and Adrian Tormos
and Daniel Hinjos and Pablo Bernabeu-Perez and Anna Arias-Duart and Pablo Agustin Martin-Torres
and Lucia Urcelay-Ganzabal and Marta Gonzalez-Mallo and Sergio Alvarez-Napagao
and Eduard Ayguade-Parra and Ulises Cortes and Dario Garcia-Gasulla},
eprint={2405.01886}, archivePrefix={arXiv}, year={2024}
}
@misc{chen2024huatuogpto1,
title={HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs},
author={Junying Chen and Zhenyang Cai and Ke Ji and Xidong Wang and Wanlong Liu and Rongsheng Wang
and Jianye Hou and Benyou Wang},
eprint={2412.18925}, archivePrefix={arXiv}, year={2024}
}
Acknowledgements
Built on Llama-3.1 (Meta), Unsloth, Hugging Face TRL/PEFT/Transformers, and bitsandbytes. We thank all dataset creators listed above for making their data publicly available. For research and clinical-decision-support only.