MediGPT: A Domain-Specific Clinical Small Language Model
MediGPT is a decoder-only GPT-style Transformer trained entirely from scratch on medical and biomedical text corpora.
Rather than fine-tuning a model pretrained on large general-purpose corpora, it learns directly from domain text. It was developed as a research-oriented educational project to explore domain-specific language modeling, transformer architectures, and clinical text generation.
Key Features
- Built entirely from scratch using PyTorch
- Decoder-only GPT architecture
- Trained on medical and biomedical datasets
- GPT-2 BPE tokenization via tiktoken
- End-to-end implementation including:
  - Data preprocessing
  - Corpus analysis
  - Transformer implementation
  - Training pipeline
  - Evaluation metrics
  - Text generation
Model Architecture
| Component | Value |
|---|---|
| Architecture | GPT Decoder |
| Layers | 8 |
| Attention Heads | 8 |
| Hidden Size | 512 |
| Context Length | 512 |
| Dropout | 0.1 |
| Tokenizer | GPT-2 (tiktoken) |
| Framework | PyTorch |
Approximate Parameter Count: ~90M
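
As a rough cross-check on the table, the sketch below estimates the parameter count from the listed hyperparameters. It assumes a standard GPT-2-style block (fused QKV projection, 4x MLP expansion, biases on every linear layer, learned positional embeddings) and the 50,257-token GPT-2 vocabulary. None of these details are confirmed by this card, and `GPTConfig` and `estimate_params` are illustrative names, not the project's own code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GPTConfig:
    # Values taken from the architecture table above.
    n_layer: int = 8
    n_head: int = 8
    d_model: int = 512
    context_length: int = 512
    dropout: float = 0.1
    # GPT-2 BPE vocabulary size as exposed by tiktoken's "gpt2" encoding.
    vocab_size: int = 50257


def estimate_params(cfg, tied_embeddings=True, mlp_ratio=4):
    """Estimate trainable parameters for an assumed GPT-2-style layout."""
    d = cfg.d_model
    attn = (d * 3 * d + 3 * d) + (d * d + d)  # QKV projection + output projection
    hidden = mlp_ratio * d
    mlp = (d * hidden + hidden) + (hidden * d + d)  # up- and down-projection
    norms = 2 * (2 * d)  # two LayerNorms per block (weight + bias)
    per_layer = attn + mlp + norms
    embeddings = cfg.vocab_size * d + cfg.context_length * d  # token + position
    final_norm = 2 * d
    # An untied output head adds a second vocab_size x d_model matrix.
    head = 0 if tied_embeddings else cfg.vocab_size * d
    return cfg.n_layer * per_layer + embeddings + final_norm + head


cfg = GPTConfig()
print(f"tied:   {estimate_params(cfg, tied_embeddings=True):,}")   # 51,213,824
print(f"untied: {estimate_params(cfg, tied_embeddings=False):,}")  # 76,945,408
```

Under these assumptions the estimate is about 51M parameters with tied input/output embeddings and about 77M with an untied output head, so the ~90M figure quoted above presumably reflects implementation details that the table does not list.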
Training Data
MediGPT was trained on a combination of medical datasets:
- MedQuAD: Medical question-answer pairs covering diseases, symptoms, diagnostics, treatments, and healthcare information.
- PubMedQA: Biomedical question-answer data derived from scientific literature and PubMed abstracts.
The resulting corpus exposes the model to:
- Clinical terminology
- Biomedical vocabulary
- Medical QA patterns
- Research-style writing
- Healthcare discourse
Training Objective
The model was trained using autoregressive next-token prediction.
Given a sequence such as:
Symptoms of diabetes include ...
the model learns to assign high probability to the token that actually follows in the training text, at every position of the sequence.
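
The objective can be illustrated with a minimal, framework-free sketch: the targets are the inputs shifted left by one position, and the loss is the mean negative log-probability the model assigns to each true next token. The helper names and the toy token ids below are hypothetical. The actual training pipeline uses PyTorch and is not reproduced here.

```python
import math


def make_example(token_ids, block_size):
    """Return (inputs, targets), where targets are inputs shifted by one token."""
    window = token_ids[: block_size + 1]
    return window[:-1], window[1:]


def cross_entropy(prob_rows, targets):
    """Mean negative log-probability of the true next token at each position.

    prob_rows[i] is the model's probability distribution over the vocabulary
    after reading inputs[: i + 1]; targets[i] is the token that really follows.
    """
    nll = [-math.log(row[t]) for row, t in zip(prob_rows, targets)]
    return sum(nll) / len(nll)


# Toy ids standing in for a tokenized sentence (not real GPT-2 token ids).
ids = [3, 1, 4, 1, 5]
inputs, targets = make_example(ids, block_size=4)
print(inputs)   # [3, 1, 4, 1]
print(targets)  # [1, 4, 1, 5]
```

A model that puts all of its probability on the correct next token at every position drives this loss to zero, while one that guesses uniformly over a vocabulary of size V scores log(V).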
Evaluation
Evaluation included:
- Validation Loss
- Perplexity
- Corpus Statistics
- Attention Analysis
- Qualitative Text Generation
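
Validation loss and perplexity are two views of the same quantity: perplexity is conventionally the exponential of the mean per-token cross-entropy. The sketch below assumes the loss is measured in nats (natural log), which is the PyTorch default. The card does not state how perplexity was computed here, and the `perplexity` helper is illustrative.

```python
import math


def perplexity(token_nlls):
    """Perplexity from per-token negative log-likelihoods, measured in nats."""
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))


# A model that is uniformly unsure among 4 tokens has a perplexity of about 4.
uniform_over_four = [math.log(4)] * 10
print(round(perplexity(uniform_over_four), 6))  # 4.0
```

Read this way, a perplexity of k means the model is, on average, about as uncertain as if it were choosing uniformly among k tokens.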
Results indicate successful learning of:
- Clinical vocabulary
- Biomedical terminology
- Medical writing style
- Research-oriented sentence structures
Example Prompt
Input:
Symptoms of diabetes include
Generated continuation:
Symptoms of diabetes include increased thirst, frequent urination, fatigue, blurred vision and other complications associated with impaired glucose regulation.
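
The card does not state which decoding strategy produced this continuation. As a minimal sketch of how autoregressive generation typically works, the loop below repeatedly asks a model for next-token logits, samples one token (with temperature and optional top-k), and appends it to the sequence. `next_token_logits` is a hypothetical stand-in for the trained model's forward pass, and the toy model in the usage example is purely illustrative.

```python
import math
import random


def sample_next(logits, temperature=1.0, top_k=None, rng=random):
    """Pick one token id from raw logits; temperature <= 0 means greedy."""
    if temperature <= 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    if top_k is not None:
        order = order[:top_k]  # keep only the k highest-scoring tokens
    scaled = [logits[i] / temperature for i in order]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]  # unnormalized softmax
    return rng.choices(order, weights=weights, k=1)[0]


def generate(next_token_logits, prompt_ids, max_new_tokens,
             context_length=512, temperature=1.0, top_k=None):
    """Extend prompt_ids one token at a time, feeding each output back in."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        # The model only sees the most recent context_length tokens.
        logits = next_token_logits(ids[-context_length:])
        ids.append(sample_next(logits, temperature=temperature, top_k=top_k))
    return ids


def toy_logits(ids, vocab_size=5):
    """Toy model: strongly prefers (last token + 1) modulo vocab_size."""
    favored = (ids[-1] + 1) % vocab_size
    return [10.0 if i == favored else 0.0 for i in range(vocab_size)]


print(generate(toy_logits, [0], max_new_tokens=4, temperature=0))  # [0, 1, 2, 3, 4]
```

Lower temperatures and smaller `top_k` values make the output more deterministic. Higher values add variety, at the cost of a greater chance of sampling an unlikely token.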
Limitations
- Small-scale model compared to modern foundation models
- Limited medical reasoning capability
- May generate inaccurate information
- Not suitable for clinical deployment
- Intended primarily for research and educational purposes
Medical Disclaimer
This model is not a medical device and must not be used for:
- Medical diagnosis
- Treatment recommendations
- Clinical decision making
- Emergency healthcare situations
All outputs should be verified by qualified healthcare professionals.
Author
Rishabh Shenoy
Developed as a research-oriented educational project exploring domain-specific language model development and biomedical NLP.
Citation
If you use this work, please cite the repository and model page.