---
library_name: peft
base_model: microsoft/phi-2
tags:
- biomedical
- summarization
- lay-summary
- healthcare
- nlp
- fine-tuned
- lora
- peft
- elife
- plos
- medical-text
language:
- en
license: mit
metrics:
- rouge
- bertscore
- readability
datasets:
- elife
- plos
pipeline_tag: text2text-generation
---

# Phi-2 BioLaySum: Biomedical Lay Summarization Model

## Model Overview

**Phi-2 BioLaySum** is the **champion model** from our comparison of six summarization systems: the most efficient and highest-performing at generating lay summaries of biomedical articles. It converts complex medical research into summaries the general public can readily understand, significantly improving access to scientific literature.

**Key Achievement**: This model **outperformed** T5-Base, T5-Large, FlanT5-Base, BioGPT, and Falconsi-Medical_summarisation across all evaluation dimensions (relevance, readability, and factuality) while maintaining the best computational efficiency.

## Model Purpose

This model addresses the critical need to bridge the gap between complex biomedical research and public health literacy by:

- Converting medical articles into patient-friendly summaries
- Supporting healthcare communication between professionals and patients
- Enhancing public access to biomedical research findings
- Enabling better-informed health decisions by the general public

## Model Architecture

- **Base Model**: microsoft/phi-2
- **Fine-tuning Technique**: LoRA (Low-Rank Adaptation) with PEFT (Parameter-Efficient Fine-Tuning)
- **Model Type**: Text-to-Text Generation (Summarization)
- **Language**: English
- **Domain**: Biomedical/Healthcare

## Performance Highlights

### Why Phi-2 is the Champion Model:

- **Superior Performance**: Best scores across relevance, readability, and factuality metrics
- **Resource Efficiency**: Optimal performance-to-resource ratio
- **Compact Size**: Smallest model size and computational footprint among the systems compared
- **Cost-Effective**: Best balance of quality and computational cost

### Evaluation Results:

- **Relevance**: Measured using ROUGE (1, 2, L) and BERTScore
- **Readability**: Assessed via Flesch-Kincaid Grade Level (FKGL) and Dale-Chall Readability Score (DCRS)
- **Factuality**: Verified using BARTScore and factual consistency checks
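
As a minimal sketch of how the relevance and readability metrics above can be computed, assuming the `evaluate` and `textstat` packages (these are not dependencies of the model itself):

```python
# Hedged sketch: scoring one generated summary against a reference.
# Assumes `pip install evaluate rouge_score bert_score textstat`.
import evaluate
import textstat

rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")

def score_summary(prediction: str, reference: str) -> dict:
    r = rouge.compute(predictions=[prediction], references=[reference])
    b = bertscore.compute(predictions=[prediction], references=[reference], lang="en")
    return {
        "rouge1": r["rouge1"],
        "rouge2": r["rouge2"],
        "rougeL": r["rougeL"],
        "bertscore_f1": sum(b["f1"]) / len(b["f1"]),
        # Readability: lower grade levels mean more accessible text
        "fkgl": textstat.flesch_kincaid_grade(prediction),
        "dcrs": textstat.dale_chall_readability_score(prediction),
    }
```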

## Quick Start

### Loading the Model

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the base model and tokenizer
base_model_name = "microsoft/phi-2"
model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# Load the fine-tuned LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, "sank29mane/phi-2-biolaysum")

# Phi-2 has no padding token by default, so reuse the EOS token
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
```

### Generating Lay Summaries

```python
def generate_lay_summary(medical_text, max_new_tokens=150):
    # Prepare the input prompt
    prompt = f"Summarize the following medical text for a general audience: {medical_text}"
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
    inputs = {k: v.to(model.device) for k, v in inputs.items()}  # move tensors to the model's device

    # Generate the summary; max_new_tokens bounds the summary itself,
    # not prompt + summary
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id
        )

    # Decode only the newly generated tokens, skipping the echoed prompt
    generated = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(generated, skip_special_tokens=True).strip()

# Example usage
medical_text = """
The study investigated the efficacy of novel therapeutic interventions
in cardiovascular disease management through randomized controlled trials...
"""

lay_summary = generate_lay_summary(medical_text)
print(f"Lay Summary: {lay_summary}")
```

## Training Details

### Training Data

- **eLife Dataset**: Open-access biomedical research articles with lay summaries
- **PLOS Dataset**: Public Library of Science biomedical publications with lay summaries
- **Data Processing**: Articles are paired with their lay summaries and formatted into training prompts (see the sketch below)
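
As a minimal sketch of that pairing step, assuming local JSONL files with `article` and `lay_summary` fields (the file paths, field names, and prompt format here are illustrative assumptions, not the exact training setup):

```python
from datasets import load_dataset

# Hypothetical local export of the eLife/PLOS lay summarization data;
# paths and field names are assumptions for illustration only.
raw = load_dataset("json", data_files={"train": "elife_plos_train.jsonl"})

def to_prompt(example):
    # Mirror the inference prompt so training and generation match
    return {
        "text": (
            "Summarize the following medical text for a general audience: "
            f"{example['article']}\nSummary: {example['lay_summary']}"
        )
    }

train = raw["train"].map(to_prompt)
```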

### Training Configuration

- **Fine-tuning Method**: LoRA (Low-Rank Adaptation) with PEFT (illustrated in the sketch below)
- **Base Model**: microsoft/phi-2
- **Training Framework**: PyTorch + Hugging Face Transformers
- **Optimization**: Parameter-efficient approach that reduces computational requirements
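
A minimal LoRA setup with PEFT might look like the following; the rank, alpha, dropout, and target modules are illustrative assumptions, not the published training hyperparameters:

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Illustrative LoRA configuration; r, alpha, dropout, and target modules
# are assumptions, not the exact values used to train this adapter.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],  # Phi-2 attention projections
    task_type="CAUSAL_LM",
)

base = AutoModelForCausalLM.from_pretrained("microsoft/phi-2", torch_dtype=torch.float16)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # prints the small trainable fraction LoRA adds
```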

### Training Advantages

- **Efficiency**: LoRA reduces the number of trainable parameters while maintaining performance
- **Resource-Friendly**: PEFT enables high-quality fine-tuning on limited hardware
- **Stability**: Freezing the base weights and training only small adapters keeps fine-tuning stable

## Comparative Analysis

### Models Compared:

1. **T5-Base** - Text-to-Text Transfer Transformer (Base)
2. **T5-Large** - Text-to-Text Transfer Transformer (Large)
3. **FlanT5-Base** - Instruction-tuned T5 model
4. **BioGPT** - Biomedical domain-specific GPT
5. **Phi-2** - Microsoft's efficient language model (**Winner**)
6. **Falconsi-Medical_summarisation** - Specialized medical summarization model

### Key Findings:

- **Phi-2 outperformed all competitors** in the comprehensive evaluation
- **Domain-specific models** (BioGPT, Falconsi) showed advantages over general-purpose T5 models
- **Parameter efficiency** gave Phi-2 superior cost-effectiveness
- **Smaller models** can outperform larger ones with appropriate fine-tuning
|
|
## π― Use Cases |
|
|
|
|
|
### Healthcare Applications: |
|
|
- **Patient Education**: Convert research findings into understandable format |
|
|
- **Medical Communication**: Support doctor-patient conversations |
|
|
- **Health Journalism**: Assist science writers and health reporters |
|
|
- **Educational Materials**: Create teaching resources for health education |
|
|
- **Policy Support**: Provide accessible summaries for health policy decisions |
|
|
|
|
|
### Target Audiences: |
|
|
- Healthcare professionals seeking patient communication tools |
|
|
- Patients and families researching medical conditions |
|
|
- Health educators and trainers |
|
|
- Medical journalists and science communicators |
|
|
- Public health policy makers |
|
|
|
|

## Performance Metrics

### Evaluation Framework:

- **ROUGE Scores**: Overlap-based relevance assessment
- **BERTScore**: Semantic similarity evaluation
- **Readability Metrics**: FKGL and DCRS for accessibility
- **Factual Consistency**: BARTScore for accuracy verification

### Resource Efficiency:

- **Model Size**: Compact and deployment-friendly
- **Inference Speed**: Fast generation suitable for real-time applications
- **Memory Usage**: Optimized for a range of computational environments
- **Cost Effectiveness**: Best performance per computational dollar

## Technical Specifications

### Model Details:

- **Architecture**: Transformer-based (Phi-2) with LoRA adaptation
- **Parameters**: Base Phi-2 weights plus lightweight LoRA adapters
- **Precision**: Mixed-precision training for efficiency
- **Framework**: PyTorch with the Hugging Face ecosystem

### System Requirements:

- **Minimum GPU**: ~4GB VRAM for inference (see the quantized-loading sketch below)
- **Recommended**: 8GB+ VRAM for best performance
- **CPU**: CPU inference is supported but slower
- **Dependencies**: transformers, peft, torch
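
For tighter memory budgets, one option (an assumption about your setup, not a requirement of the model) is loading the base model in 4-bit with bitsandbytes before attaching the adapter:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Optional: 4-bit quantization to fit inference into small GPUs.
# Requires `pip install bitsandbytes`; the settings here are illustrative.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
base = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "sank29mane/phi-2-biolaysum")
```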

## Research Impact

This model contributes to:

- **Democratizing Medical Knowledge**: Making research accessible to all
- **Advancing Healthcare NLP**: Pushing the boundaries of medical text processing
- **Resource-Efficient AI**: Demonstrating effective use of LoRA and PEFT
- **Evaluation Methodology**: A comprehensive framework for summarization assessment

## License & Citation

### License

This model is released under the **MIT License**, promoting open research and development.

### Citation

If you use this model in your research, please cite:

```bibtex
@misc{mane2024phi2biolaysum,
  title={Phi-2 BioLaySum: Resource-Efficient Biomedical Lay Summarization using LoRA and PEFT},
  author={Mane, Sanket},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/sank29mane/phi-2-biolaysum}
}
```

## Related Resources

- **GitHub Repository**: [lays-bio-summery](https://github.com/sank29mane/lays-bio-summery) - Complete training code and evaluation
- **Base Model**: [microsoft/phi-2](https://huggingface.co/microsoft/phi-2)
- **Research Paper**: [Detailed methodology and results](https://github.com/sank29mane/lays-bio-summery)

## Author

**Sanket Mane** - [@sank29mane](https://github.com/sank29mane)
*Researcher in Biomedical NLP and Efficient Language Models*

## Contact & Support

- **GitHub Issues**: [Create an issue](https://github.com/sank29mane/lays-bio-summery/issues)
- **Model Issues**: Use the Community tab above
- **Research Collaborations**: Through the GitHub profile

## Limitations & Considerations

### Current Limitations:

- **Language**: Currently optimized for English biomedical text
- **Domain**: Focused on general biomedical research articles, not clinical notes
- **Length**: Tuned for article-length inputs; quality may vary on very short or very long texts

### Recommended Use:

- Use for summarizing biomedical research articles
- Validate outputs before relying on them for critical healthcare decisions
- Include human review for patient-facing applications

## Model Updates

- **v1.0**: Initial release with LoRA + PEFT fine-tuning
- **Future**: Planned improvements for multi-language support and clinical-text adaptation

---

### Framework Versions

- **PEFT**: 0.7.2.dev0
- **Transformers**: Compatible with latest versions
- **PyTorch**: 1.12+

**Star this model if you find it useful for your biomedical NLP research!**