---
license: llama3.1
base_model: unsloth/Meta-Llama-3.1-8B
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- dental
- medical
- healthcare
- clinical
- llama-3.1
- cpt
- sft
- dpo
- qlora
- unsloth
- pall
---

# PALL-Text: A Dental-Domain Llama-3.1-8B

**PALL-Text** is a dental-domain–specialized large language model, adapted from **Llama-3.1-8B** through a three-stage post-training pipeline: **Continued Pre-Training (CPT) → Supervised Fine-Tuning (SFT) → Direct Preference Optimization (DPO)**. The whole pipeline runs end-to-end under 4-bit QLoRA on a **single A100-40GB GPU** for roughly **\$20** of cloud compute.

This repository hosts the **final, fully-merged bf16 model** (all three adapters merged into the base weights; no PEFT adapter is required at inference).

- **Developed by:** Harisundar R
- **Base model:** [`unsloth/Meta-Llama-3.1-8B`](https://huggingface.co/unsloth/Meta-Llama-3.1-8B)
- **Code:** [PALL on GitHub](https://github.com/HARISUNDARRAJENDRAN/PALL)
- **Companion VLM:** [`Harisundar/PALL-VLM`](https://huggingface.co/Harisundar/PALL-VLM)
- **Training data:** [`Harisundar/pall`](https://huggingface.co/datasets/Harisundar/pall)
- **Language:** English
- **License:** Llama 3.1 Community License

---

## Model description

Frontier LLMs remain clinically uneven on dental tasks, while specialized dental models are typically closed-weight and require multi-GPU clusters. PALL closes this gap with an open, single-GPU, reproducible recipe.

**The contribution is integration, not a new algorithm:** established parameter-efficient techniques are combined into one affordable pipeline and applied to dentistry, including preference tuning for clinical safety, which the dental-LLM literature otherwise lacks.

### Architecture

- `LlamaForCausalLM`, 8.03B parameters, bf16, 32 layers, hidden size 4096, vocab 128,256.
- Grouped-Query Attention (32 query / 8 KV heads), RoPE, RMSNorm, SwiGLU MLP.
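
The 8.03B figure can be reproduced from the architecture bullets above. The sketch below is only a back-of-the-envelope count. It takes the layer count, hidden size, vocabulary size, and head counts from this card, and it assumes an MLP intermediate size of 14,336, untied input and output embeddings, and bias-free projections, none of which are stated here. The function name `llama_param_count` is hypothetical.

```python
def llama_param_count(vocab=128_256, hidden=4096, layers=32,
                      q_heads=32, kv_heads=8,
                      intermediate=14_336,      # assumption: not stated in this card
                      tied_embeddings=False):   # assumption: separate lm_head
    """Count weights of a Llama-style decoder with GQA and a SwiGLU MLP."""
    head_dim = hidden // q_heads                # 4096 / 32 = 128
    kv_dim = kv_heads * head_dim                # 8 * 128 = 1024 (GQA shrinks K and V)
    attn = 2 * hidden * hidden + 2 * hidden * kv_dim   # q, o + k, v projections
    mlp = 3 * hidden * intermediate             # gate, up, down projections
    norms = 2 * hidden                          # two RMSNorm weight vectors per layer
    per_layer = attn + mlp + norms
    embed = vocab * hidden
    head = 0 if tied_embeddings else vocab * hidden
    final_norm = hidden
    return embed + layers * per_layer + final_norm + head


total = llama_param_count()
print(f"{total:,} parameters ({total / 1e9:.2f}B)")
# → 8,030,261,248 parameters (8.03B)
```

Roughly 1.05B of these weights sit in the two embedding matrices, which is why the large 128,256-token vocabulary matters for the memory budget.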

### Training pipeline

| Stage | Objective | Data | Key config | Eval |
|-------|-----------|------|-----------|------|
| **CPT** | inject dental knowledge | ~175M-token dental corpus | r=64, α=128, lr 2e-4, 1 epoch | eval loss 1.684 (ppl ≈ 5.39) |
| **SFT** | instruction following | ~392K Q&A pairs (loss on assistant tokens) | lr 1e-4, 2 epochs, eff. batch 48 | eval loss 1.292 |
| **DPO** | safety / preference alignment | ~10.7K preference triplets | β=0.1, lr 2e-6, 1 epoch | eval loss 0.0138, **99.5% pref. acc.** |

### Efficiency stack

**QLoRA** (4-bit NF4 + LoRA r=64/α=128 on all 7 projections), **FlashAttention-2**, **Unsloth** fused kernels, and **paged AdamW 8-bit**, all in bf16 with gradient checkpointing. Peak VRAM stays under 40 GB throughout.

---

## Results

Held-out dental benchmark (250 MCQ · 250 oral-disease open-QA · 500 dental-forum open-QA):

| Stage | Dental MCQ | Note |
|-------|:----------:|------|
| Baseline Llama-3.1-8B | 56.0% | - |
| + CPT | 4.0% | format collapse (knowledge gained, not lost) |
| + SFT | **58.0%** | best MCQ; knowledge becomes accessible |
| + DPO (this model) | 48.8% | trades exam rigidity for open-ended quality |

DPO's gains land on the deployment-relevant axis (oral-disease open-QA, 1–5 judge scale): correctness **3.78 → 4.61**, clarity **3.84 → 4.78**, hedging/safety **4.82 → 4.97**, and the failure rate (runaway generation / meta-narration) drops from ~100% to **16%**.

---

## Intended use

- **Intended:** dental education, clinical-knowledge Q&A, patient-communication drafting, and as an open backbone for further dental fine-tuning or the multimodal [PALL-VLM](https://huggingface.co/Harisundar/PALL-VLM).
- **Out of scope:** autonomous diagnosis or treatment decisions; non-dental medical advice; any use without qualified clinician oversight.
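
Two figures from the Training pipeline table and Efficiency stack above can be sanity-checked with a few lines of arithmetic: the CPT perplexity is the exponential of the reported eval loss, and the adapter size follows from r=64 applied to seven projections in each of the 32 layers. This is a sketch under assumptions. The projection names follow the usual Hugging Face Llama naming, the intermediate size of 14,336 is not stated in this card, and `lora_trainable_params` is a hypothetical helper.

```python
import math


def lora_trainable_params(r=64, hidden=4096, kv_dim=1024,
                          intermediate=14_336,  # assumption: not stated in this card
                          layers=32):
    """LoRA adds A (r x d_in) and B (d_out x r) per adapted weight matrix."""
    # Assumed reading of "all 7 projections": the standard Llama module names.
    shapes = {
        "q_proj": (hidden, hidden),
        "k_proj": (hidden, kv_dim),      # 8 KV heads * head dim 128
        "v_proj": (hidden, kv_dim),
        "o_proj": (hidden, hidden),
        "gate_proj": (hidden, intermediate),
        "up_proj": (hidden, intermediate),
        "down_proj": (intermediate, hidden),
    }
    per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
    return layers * per_layer


print(f"LoRA trainable parameters: {lora_trainable_params():,}")   # → 167,772,160
print(f"LoRA scaling alpha / r:    {128 / 64}")                     # → 2.0
print(f"CPT perplexity exp(1.684): {math.exp(1.684):.2f}")          # → 5.39
```

Under these assumptions the adapters hold about 168M trainable weights, roughly 2% of the 8.03B-parameter base, which is consistent with the whole pipeline fitting on one 40 GB GPU.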

## Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Harisundar/PALL-Text"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a careful dental clinical assistant."},
    {"role": "user", "content": "What is the recommended management for irreversible pulpitis?"},
]

# Render the chat template and tokenize; return_dict=True also returns the attention mask.
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_dict=True, return_tensors="pt"
).to(model.device)

# Greedy decoding; slice off the prompt tokens so only the new answer is printed.
out = model.generate(**inputs, max_new_tokens=400, do_sample=False)
prompt_len = inputs["input_ids"].shape[1]
print(tok.decode(out[0][prompt_len:], skip_special_tokens=True))
```

---

## Training Data Sources & Acknowledgements

PALL is trained on data assembled from many publicly available sources. We gratefully acknowledge the creators of these datasets. Row counts are from our dental-filtered subsets; the original datasets may be larger.

### CPT: Continued Pre-Training corpus (~406K documents, ~175M tokens)

| Source | Rows | Attribution |
|--------|-----:|-------------|
| [OpenAlex](https://openalex.org/) dental works | 199,518 | Priem, J., Piwowar, H., & Orr, R. (2022). *OpenAlex: A fully-open index of scholarly works, authors, venues, institutions, and concepts.* arXiv:2205.01833 |
| [PubMed](https://pubmed.ncbi.nlm.nih.gov/) dental abstracts | 95,436 | U.S. National Library of Medicine / NCBI |
| [PMC](https://www.ncbi.nlm.nih.gov/pmc/) open-access dental full text | 49,895 | U.S. National Library of Medicine / NCBI |
| Dental textbooks (cleaned) | 32,923 | Various authors (see DPO sources below) |
| [ClinicalTrials.gov](https://clinicaltrials.gov/) dental studies | 15,379 | U.S. National Library of Medicine |
| [Wikipedia](https://www.wikipedia.org/) dental articles | 11,634 | Wikimedia Foundation (CC BY-SA) |
| HuggingFace dental extras | 1,098 | Community datasets |

### SFT: Supervised Fine-Tuning (~412K instruction pairs)

| Source | Rows | Attribution |
|--------|-----:|-------------|
| [`OpenMed/Medical-Reasoning-SFT-Mega`](https://huggingface.co/datasets/OpenMed/Medical-Reasoning-SFT-Mega) | 24,550 | OpenMed team |
| PubMedQA dental (artificial subset) | 20,662 | Jin, Q., Dhingra, B., Liu, Z., Cohen, W.W., & Lu, X. (2019). *PubMedQA: A Dataset for Biomedical Research Question Answering.* EMNLP 2019 |
| [`ibivibiv/medical_instruct_en`](https://huggingface.co/datasets/ibivibiv/medical_instruct_en) | 19,336 | ibivibiv (HuggingFace) |
| [`FreedomIntelligence/ApolloCorpus`](https://huggingface.co/datasets/FreedomIntelligence/ApolloCorpus) | 18,016 | Wang, X. et al. (2024). *Apollo: A Lightweight Multilingual Medical LLM towards Democratizing Medical AI to 6B People.* arXiv:2403.03640 |
| ChatDoctor HealthcareMagic dental | 12,423 | Li, Y. et al. (2023). *ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) Using Medical Domain Knowledge.* Cureus 15(6) |
| [`exafluence/Open-MedQA-Nexus`](https://huggingface.co/datasets/exafluence/Open-MedQA-Nexus) | 11,608 | Exafluence team |
| [`accolade2025/dental-llama2-35k`](https://huggingface.co/datasets/accolade2025/dental-llama2-35k) | 10,197 | accolade2025 (HuggingFace) |
| [`electricsheepafrica/oral-health-dental-disease`](https://huggingface.co/datasets/electricsheepafrica/oral-health-dental-disease) | 10,000 | electricsheepafrica (HuggingFace) |
| [`Intelligent-Internet/II-Medical-Reasoning-SFT`](https://huggingface.co/datasets/Intelligent-Internet/II-Medical-Reasoning-SFT) | 9,738 | Intelligent Internet team |
| [`ruslanmv/ai-medical-chatbot`](https://huggingface.co/datasets/ruslanmv/ai-medical-chatbot) | 9,681 | ruslanmv (HuggingFace) |
| [`AGBonnet/augmented-clinical-notes`](https://huggingface.co/datasets/AGBonnet/augmented-clinical-notes) | 16,265 | AGBonnet (HuggingFace) |
| [`FremyCompany/Asclepius-Synthetic-Clinical-Notes-QA-EN`](https://huggingface.co/datasets/FremyCompany/Asclepius-Synthetic-Clinical-Notes-QA-EN) | 8,138 | Kweon, S. et al. (2023). *Publicly Shareable Clinical Large Language Model Built on Synthetic Clinical Notes.* arXiv:2309.00237 |
| MedMCQA dental subset | 6,315 | Pal, A., Umapathi, L.K., & Sankarasubbu, M. (2022). *MedMCQA: A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering.* CHIL 2022 |
| [`Mxode/Chinese-Medical-Instruct-1M`](https://huggingface.co/datasets/Mxode/Chinese-Medical-Instruct-1M) | 6,528 | Mxode (HuggingFace) |
| [`miriad/miriad-4.4M`](https://huggingface.co/datasets/miriad/miriad-4.4M) | 5,995 | Zheng, Q. et al. (2025). *MIRIAD: Augmenting LLMs with millions of medical query-response pairs.* arXiv:2506.06091 |
| HuatuoGPT2-SFT (GPT-4 generated) | 4,977 | Chen, J. et al. (2023). *HuatuoGPT-II: One-stage Training for Medical Adaption of LLMs.* arXiv:2311.09774 |
| [`BrainHealthAI/MedQA_mutilangual`](https://huggingface.co/datasets/BrainHealthAI/MedQA_mutilangual) | 3,970 | BrainHealthAI (HuggingFace); originally Jin, D. et al. (2021). *What Disease does this Patient Have? A Large-scale Open Domain Question Answering Dataset from Medical Exams.* Applied Sciences |
| Dental forum / Jonathan Kang dental | 2,419 | Community dental forums |
| [`Tuminha/dental-evidence-dataset`](https://huggingface.co/datasets/Tuminha/dental-evidence-dataset) | 1,931 | Tuminha (HuggingFace) |
| [`FreedomIntelligence/medical-o1-reasoning-SFT`](https://huggingface.co/datasets/FreedomIntelligence/medical-o1-reasoning-SFT) | 2,910 | FreedomIntelligence team |
| [`naazimsnh02/dentalgemma-instruct`](https://huggingface.co/datasets/naazimsnh02/dentalgemma-instruct) | 1,743 | naazimsnh02 (HuggingFace) |
| [`AnonymousSub/MedQuAD_47441_Question_Answer_Pairs`](https://huggingface.co/datasets/AnonymousSub/MedQuAD_47441_Question_Answer_Pairs) | 1,734 | Ben Abacha, A. & Demner-Fushman, D. (2019). *A Question-Entailment Approach to Question Answering.* BMC Bioinformatics |
| [`HPAI-BSC/Aloe-Beta-Medical-Collection`](https://huggingface.co/datasets/HPAI-BSC/Aloe-Beta-Medical-Collection) | 1,633 | Gururajan, A.K. et al. (2024). *Aloe: A Family of Fine-tuned Open Healthcare LLMs.* arXiv:2405.01886 |
| [`saidonepudi8/dental_tc`](https://huggingface.co/datasets/saidonepudi8/dental_tc) | 1,149 | saidonepudi8 (HuggingFace) |
| [`lavita/AlpaCare-MedInstruct-52k`](https://huggingface.co/datasets/lavita/AlpaCare-MedInstruct-52k) | 1,063 | Zhang, X. et al. (2023). *AlpaCare: Instruction-tuned Large Language Models for Medical Application.* arXiv:2310.14558 |
| Textbook-derived SFT | 2,039 | Various dental textbook authors |

### DPO: Direct Preference Optimization (~10.7K preference triplets)

DPO preference pairs were constructed from dental textbook content by scoring candidate responses on a 5-axis rubric (correctness, conciseness, hedging, clarity, safety) and pairing top-quartile with bottom-quartile responses. Source textbooks include works by:

- Hargreaves, K.M., *Cohen's Pathways of the Pulp* (9th Ed.)
- Humes, H.D., *Kelley's Textbook of Internal Medicine* (4th Ed.)
- Harvey, A., *Lippincott's Illustrated Biochemistry* (5th Ed.)
- Myers, J., *Oral Cancer Metastasis*
- Koolman, J., *Color Atlas of Biochemistry* (2nd Ed.)
- Fonseca, R.J., *Oral and Maxillofacial Surgery* (Vol I)
- Mehta, N.R., *Head, Face, and Neck Pain*
- Fejerskov, O. & Kidd, E., *Dental Caries: The Disease and its Clinical Management* (2nd Ed.)
- Mitchell, D.A., *Oxford Handbook of Clinical Dentistry* (6th Ed.)
- Regezi, J.A., *Oral Pathology* (7th Ed.)
- Malamed, S.F., *Sedation: A Guide to Patient Management* (5th Ed.)
- And 100+ additional dental and medical textbooks

---

## Limitations & risks

- May hallucinate or omit; **not** a substitute for professional clinical judgment.
- Strict multiple-choice accuracy (48.8%) is 9.2 points below the SFT checkpoint (58.0%) and 7.2 points below the base model (56.0%). This is an accepted trade-off: DPO favors open-ended clinical reasoning over exam-style answers.
- Conciseness on long, ambiguous patient narratives remains the weakest dimension.
- Not evaluated for long-horizon treatment planning, rare pathologies, or adversarial cases.
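
The DPO pair construction described under the training data sources (rubric scoring, then top-quartile versus bottom-quartile pairing) could look roughly like the sketch below. This is not the project's actual code. It assumes an unweighted mean over the five rubric axes, quartiles computed per prompt, and best-with-worst pairing, none of which the card specifies. The names `rubric_score` and `build_triplets` and the toy scores are hypothetical.

```python
from statistics import mean

AXES = ("correctness", "conciseness", "hedging", "clarity", "safety")


def rubric_score(scores):
    """Collapse the five rubric axes into one number (assumed: unweighted mean)."""
    return mean(scores[axis] for axis in AXES)


def build_triplets(prompt, candidates):
    """candidates: list of (response_text, {axis: score}) for a single prompt.

    Returns (prompt, chosen, rejected) triplets that pair the top quartile
    with the bottom quartile, best response against worst.
    """
    ranked = sorted(candidates, key=lambda c: rubric_score(c[1]), reverse=True)
    q = len(ranked) // 4
    if q == 0:                       # too few candidates to form quartiles
        return []
    top, bottom = ranked[:q], ranked[-q:]
    return [
        {"prompt": prompt, "chosen": good[0], "rejected": bad[0]}
        for good, bad in zip(top, reversed(bottom))
    ]


# Toy data: eight candidate answers with made-up, uniform per-axis scores.
toy_scores = [5.0, 4.5, 4.0, 3.5, 3.0, 2.5, 2.0, 1.0]
candidates = [
    (f"response {i}", {axis: s for axis in AXES})
    for i, s in enumerate(toy_scores)
]
triplets = build_triplets("How is dry socket managed?", candidates)
for t in triplets:
    print(t["chosen"], ">", t["rejected"])
# → response 0 > response 7
# → response 1 > response 6
```

Pairing across the quartile gap rather than between adjacent ranks gives each triplet a wide quality margin, which may be part of why the reported preference accuracy is so high.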

---

## Citation

If you use PALL, please cite:

```bibtex
@misc{rajendran2026pall,
  title = {PALL: Post-training Adaptation of Large Language Models at Low Cost for Dental-Domain Specialization},
  author = {Rajendran, Harisundar},
  year = {2026},
  howpublished = {\url{https://huggingface.co/Harisundar/PALL-Text}},
}
```

### Foundational works

```bibtex
@inproceedings{hu2022lora,
  title={LoRA: Low-Rank Adaptation of Large Language Models},
  author={Hu, Edward J. and Shen, Yelong and Wallis, Phillip and Allen-Zhu, Zeyuan and Li, Yuanzhi and Wang, Shean and Wang, Lu and Chen, Weizhu},
  booktitle={ICLR},
  year={2022}
}

@inproceedings{dettmers2023qlora,
  title={QLoRA: Efficient Finetuning of Quantized LLMs},
  author={Dettmers, Tim and Pagnoni, Artidoro and Holtzman, Ari and Zettlemoyer, Luke},
  booktitle={NeurIPS},
  year={2023}
}

@inproceedings{rafailov2023dpo,
  title={Direct Preference Optimization: Your Language Model is Secretly a Reward Model},
  author={Rafailov, Rafael and Sharma, Archit and Mitchell, Eric and Ermon, Stefano and Manning, Christopher D. and Finn, Chelsea},
  booktitle={NeurIPS},
  year={2023}
}

@article{dao2023flashattention2,
  title={FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning},
  author={Dao, Tri},
  journal={arXiv:2307.08691},
  year={2023}
}

@misc{unsloth2024,
  title={Unsloth: 2x faster, 70\% less memory LLM finetuning},
  author={Han, Daniel and Han, Michael and {Unsloth team}},
  year={2024},
  howpublished={\url{https://github.com/unslothai/unsloth}}
}

@article{grattafiori2024llama3,
  title={The Llama 3 Herd of Models},
  author={Grattafiori, Aaron and others},
  journal={arXiv:2407.21783},
  year={2024}
}
```

### Key dataset citations

```bibtex
@inproceedings{jin2019pubmedqa,
  title={PubMedQA: A Dataset for Biomedical Research Question Answering},
  author={Jin, Qiao and Dhingra, Bhuwan and Liu, Zhengping and Cohen, William and Lu, Xinghua},
  booktitle={EMNLP-IJCNLP},
  pages={2567--2577},
  year={2019}
}

@inproceedings{pal2022medmcqa,
  title={MedMCQA: A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering},
  author={Pal, Ankit and Umapathi, Logesh Kumar and Sankarasubbu, Malaikannan},
  booktitle={CHIL},
  series={PMLR},
  volume={174},
  pages={248--260},
  year={2022}
}

@misc{wang2024apollo,
  title={Apollo: Lightweight Multilingual Medical LLMs towards Democratizing Medical AI to 6B People},
  author={Xidong Wang and Nuo Chen and Junyin Chen and Yan Hu and Yidong Wang and Xiangbo Wu and Anningzhe Gao and Xiang Wan and Haizhou Li and Benyou Wang},
  eprint={2403.03640},
  archivePrefix={arXiv},
  year={2024}
}

@article{li2023chatdoctor,
  title={ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) Using Medical Domain Knowledge},
  author={Li, Yunxiang and Li, Zihan and Zhang, Kai and Dan, Ruilong and Jiang, Steve and Zhang, You},
  journal={Cureus},
  volume={15},
  number={6},
  year={2023},
  doi={10.7759/cureus.40895}
}

@article{chen2023huatuogpt2,
  title={HuatuoGPT-II, One-stage Training for Medical Adaption of LLMs},
  author={Chen, Junying and Wang, Xidong and Gao, Anningzhe and Jiang, Feng and Chen, Shunian and Zhang, Hongbo and Song, Dingjie and Xie, Wenya and Kong, Chuyi and Li, Jianquan and Wan, Xiang and Li, Haizhou and Wang, Benyou},
  journal={arXiv preprint arXiv:2311.09774},
  year={2023}
}

@misc{kweon2023asclepius,
  title={Publicly Shareable Clinical Large Language Model Built on Synthetic Clinical Notes},
  author={Sunjun Kweon and Junu Kim and Jiyoun Kim and Sujeong Im and Eunbyeol Cho and Seongsu Bae and Jungwoo Oh and Gyubok Lee and Jong Hak Moon and Seng Chan You and Seungjin Baek and Chang Hoon Han and Yoon Bin Jung and Yohan Jo and Edward Choi},
  eprint={2309.00237},
  archivePrefix={arXiv},
  year={2023}
}

@misc{zhang2023alpacare,
  title={AlpaCare: Instruction-tuned Large Language Models for Medical Application},
  author={Xinlu Zhang and Chenxin Tian and Xianjun Yang and Lichang Chen and Zekun Li and Linda Ruth Petzold},
  eprint={2310.14558},
  archivePrefix={arXiv},
  year={2023}
}

@article{benabacha2019medquad,
  title={A Question-Entailment Approach to Question Answering},
  author={Ben Abacha, Asma and Demner-Fushman, Dina},
  journal={BMC Bioinformatics},
  volume={20},
  number={1},
  pages={511:1--511:23},
  year={2019}
}

@article{jin2021medqa,
  title={What Disease does this Patient Have? A Large-scale Open Domain Question Answering Dataset from Medical Exams},
  author={Jin, Di and Pan, Eileen and Oufattole, Nassim and Weng, Wei-Hung and Fang, Hanyi and Szolovits, Peter},
  journal={Applied Sciences},
  volume={11},
  number={14},
  pages={6421},
  year={2021},
  doi={10.3390/app11146421}
}

@article{priem2022openalex,
  title={OpenAlex: A fully-open index of scholarly works, authors, venues, institutions, and concepts},
  author={Priem, Jason and Piwowar, Heather and Orr, Richard},
  journal={arXiv preprint arXiv:2205.01833},
  year={2022}
}

@misc{zheng2025miriad,
  title={MIRIAD: Augmenting LLMs with millions of medical query-response pairs},
  author={Qinyue Zheng and Salman Abdullah and Sam Rawal and Cyril Zakka and Sophie Ostmeier and Maximilian Purk and Eduardo Reis and Eric J. Topol and Jure Leskovec and Michael Moor},
  eprint={2506.06091},
  archivePrefix={arXiv},
  year={2025}
}

@misc{gururajan2024aloe,
  title={Aloe: A Family of Fine-tuned Open Healthcare LLMs},
  author={Ashwin Kumar Gururajan and Enrique Lopez-Cuena and Jordi Bayarri-Planas and Adrian Tormos and Daniel Hinjos and Pablo Bernabeu-Perez and Anna Arias-Duart and Pablo Agustin Martin-Torres and Lucia Urcelay-Ganzabal and Marta Gonzalez-Mallo and Sergio Alvarez-Napagao and Eduard Ayguade-Parra and Ulises Cortes and Dario Garcia-Gasulla},
  eprint={2405.01886},
  archivePrefix={arXiv},
  year={2024}
}

@misc{chen2024huatuogpto1,
  title={HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs},
  author={Junying Chen and Zhenyang Cai and Ke Ji and Xidong Wang and Wanlong Liu and Rongsheng Wang and Jianye Hou and Benyou Wang},
  eprint={2412.18925},
  archivePrefix={arXiv},
  year={2024}
}
```

## Acknowledgements

Built on Llama-3.1 (Meta), Unsloth, Hugging Face TRL/PEFT/Transformers, and bitsandbytes. We thank all dataset creators listed above for making their data publicly available.

For research and clinical-decision-**support** only.