--- |
|
|
language: "en" |
|
|
tags: |
|
|
- biomedical |
|
|
- text-generation |
|
|
- BioGPT |
|
|
- fine-tuning |
|
|
license: "cc-by-4.0" |
|
|
datasets: |
|
|
- custom |
|
|
metrics: |
|
|
- perplexity |
|
|
- loss |
|
|
--- |
|
|
|
|
|
# TissueGPT: Fine-Tuned BioGPT for Tissue Engineering Text Generation |
|
|
|
|
|
## Model Description |
|
|
**TissueGPT** is a fine-tuned version of [BioGPT](https://huggingface.co/microsoft/BioGPT), specifically tailored for tissue engineering text generation. Trained on a dataset of biomedical research articles (titles, abstracts, and full texts), TissueGPT is designed to perform tasks such as:
|
|
|
|
|
- Summarizing biomedical literature |
|
|
- Generating coherent biomedical text |
|
|
- Assisting with scientific writing in life sciences |
|
|
- Supporting research in tissue engineering, extracellular matrix (ECM) analysis, and related fields |
|
|
|
|
|
--- |
|
|
|
|
|
## Training Details |
|
|
|
|
|
### First Round of Training |
|
|
The initial model was fine-tuned for **3 epochs**, focusing on general adaptation to the biomedical dataset. |
|
|
|
|
|
#### Hyperparameters |
|
|
- **Learning Rate**: 5e-5 |
|
|
- **Batch Size**: 8 |
|
|
- **Warmup Steps**: 500 |
|
|
- **Precision**: Mixed precision (`fp16`) |
|
|
- **Weight Decay**: 0.01 |
|
|
- **Number of Epochs**: 3 |
|
|
- **Save Checkpoints**: Every 10,000 steps, keeping the last 3 checkpoints |
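
These settings map onto Hugging Face `TrainingArguments` roughly as follows; this is a minimal sketch, assuming the standard `Trainer` API was used (the output directory is an illustrative placeholder, not the actual training path):

```python
from transformers import TrainingArguments

# First-round settings; output_dir is an illustrative placeholder.
training_args = TrainingArguments(
    output_dir="./tissuegpt-round1",
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    warmup_steps=500,
    fp16=True,                 # mixed precision
    weight_decay=0.01,
    num_train_epochs=3,
    save_steps=10_000,         # checkpoint every 10,000 steps
    save_total_limit=3,        # keep only the last 3 checkpoints
)
```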
|
|
|
|
|
#### Training and Validation Metrics |
|
|
| Epoch | Training Loss | Validation Loss | Perplexity |
|-------|---------------|-----------------|------------|
| 1     | 2.4752        | 2.4286          | 11.34      |
| 2     | 2.3680        | 2.3708          | 10.70      |
| 3     | 2.2954        | 2.3410          | 10.39      |
|
|
|
|
|
--- |
|
|
|
|
|
### Second Round of Training |
|
|
To further improve performance, the model was fine-tuned for **2 additional epochs** with adjusted hyperparameters. |
|
|
|
|
|
#### Adjusted Hyperparameters |
|
|
- **Learning Rate**: 3e-5 (reduced for finer updates) |
|
|
- **Batch Size**: 64 (to utilize the GPU’s full memory) |
|
|
- **Precision**: `bf16` (optimized for NVIDIA A100) |
|
|
- **Save Checkpoints**: Every 20,000 steps |
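
A corresponding sketch for the second round, again assuming the `Trainer` API; the output path is a placeholder, not the exact training script:

```python
from transformers import TrainingArguments

# Second-round settings; output_dir is an illustrative placeholder.
training_args = TrainingArguments(
    output_dir="./tissuegpt-round2",
    learning_rate=3e-5,               # reduced for finer updates
    per_device_train_batch_size=64,   # larger batches on the 80GB A100
    bf16=True,                        # bfloat16 precision on A100
    num_train_epochs=2,
    save_steps=20_000,                # checkpoint every 20,000 steps
)
```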
|
|
|
|
|
#### Training and Validation Metrics |
|
|
| Epoch | Training Loss | Validation Loss | Perplexity |
|-------|---------------|-----------------|------------|
| 4     | 2.2396        | 2.2395          | 9.43       |
| 5     | 2.2328        | 2.2328          | 9.32       |
|
|
|
|
|
### Hardware Used |
|
|
- **GPU**: NVIDIA A100 80GB |
|
|
- **Framework**: PyTorch with the Hugging Face Transformers library
|
|
|
|
|
--- |
|
|
|
|
|
## Evaluation Metrics |
|
|
|
|
|
### Perplexity |
|
|
Perplexity is a key metric for evaluating language models, measuring how well the model predicts sequences of text. Lower perplexity indicates better predictive performance. |
|
|
|
|
|
- **First Round of Training**: Final perplexity = **10.39** |
|
|
- **Second Round of Training**: Final perplexity = **9.32** |
|
|
|
|
|
The steady drop in perplexity across both rounds indicates increasingly fluent and coherent generated text.
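
The reported values are consistent with computing perplexity as the exponential of the mean cross-entropy validation loss; a quick sanity check in Python:

```python
import math

# Perplexity = exp(cross-entropy loss); these reproduce the reported values.
round1 = math.exp(2.3410)  # ≈ 10.39
round2 = math.exp(2.2328)  # ≈ 9.32

# Relative improvement between rounds is roughly 10%.
print(f"{round1:.2f} -> {round2:.2f} ({(round1 - round2) / round1:.1%} lower)")
```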
|
|
|
|
|
### Gradient Norms |
|
|
- Gradient norms were tracked throughout training to monitor optimization stability (see the sketch below).

- Observed range: **1.05–1.32**, indicating stable training without exploding gradients.
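
In a custom PyTorch loop, per-step gradient norms can be monitored with `torch.nn.utils.clip_grad_norm_`, which returns the total norm computed *before* clipping. A minimal sketch with a toy model standing in for TissueGPT (`max_norm=1.0` mirrors the Hugging Face Trainer default and is an assumption here):

```python
import torch
from torch import nn

# Toy model stands in for TissueGPT; the monitoring pattern is identical.
model = nn.Linear(10, 10)
loss = model(torch.randn(4, 10)).sum()
loss.backward()

# clip_grad_norm_ returns the total norm computed before clipping,
# so it doubles as a per-step gradient-norm monitor.
grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
print(f"grad norm: {grad_norm.item():.2f}")
```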
|
|
|
|
|
### Validation Loss |
|
|
- Decreasing validation loss across both rounds suggests effective generalization to unseen data. |
|
|
|
|
|
--- |
|
|
|
|
|
## Model Comparison |
|
|
|
|
|
| Metric                | First Round | Second Round |
|-----------------------|-------------|--------------|
| Final Validation Loss | 2.3410      | 2.2328       |
| Final Perplexity      | 10.39       | 9.32         |
|
|
|
|
|
**Key Insights**: |
|
|
- Additional training epochs led to improved generalization and better predictive performance. |
|
|
- Perplexity improved by approximately 10% in the second round, demonstrating enhanced text fluency and coherence. |
|
|
|
|
|
--- |
|
|
|
|
|
## How to Use the Model |
|
|
|
|
|
### Install Dependencies |
|
|
Ensure you have `transformers` and `torch` installed: |
|
|
|
|
|
```bash
pip install transformers torch
```
|
|
### Load the Model |
|
|
|
|
|
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the fine-tuned model and its tokenizer from the Hub
model_name = "Saeed/TissueGPT"  # Replace with the uploaded repo name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Encode a tissue-engineering prompt
input_text = "The extracellular matrix plays a critical role in tissue engineering because"
inputs = tokenizer(input_text, return_tensors="pt")

# Generate a continuation (max_length counts the prompt tokens as well)
output = model.generate(**inputs, max_length=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
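
Continuing from the snippet above, decoding can be switched to sampling for longer or more varied continuations; the parameter values below are illustrative starting points, not tuned defaults:

```python
# Sampling-based generation; values are illustrative, not tuned defaults.
output = model.generate(
    **inputs,
    max_new_tokens=100,       # generate up to 100 new tokens after the prompt
    do_sample=True,           # sample instead of greedy decoding
    top_p=0.9,                # nucleus sampling
    temperature=0.8,
    no_repeat_ngram_size=3,   # reduce verbatim repetition
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```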
|
|
|
|
|
---
|
|
|
|
|
## Intended Use |
|
|
|
|
|
- **Biomedical text generation and summarization** |
|
|
- **Assisting researchers, scientists, and medical professionals** |
|
|
- **Automated scientific writing** in domains such as tissue engineering and scaffold fabrication
|
|
|
|
|
---
|
|
|
|
|
## Limitations |
|
|
|
|
|
- The model is fine-tuned on biomedical literature and may not generalize well to non-biomedical domains. |
|
|
- Outputs should always be validated by experts for accuracy, especially in clinical or research-critical contexts. |
|
|
|
|
|
---
|
|
|
|
|
## Ethical Considerations |
|
|
|
|
|
- This model is intended for use in biomedical research and not for clinical diagnosis or patient care. |
|
|
- It may generate plausible-sounding but factually incorrect outputs (hallucinations). Always verify generated content. |
|
|
|
|
|
---
|
|
|
|
|
## Citation |
|
|
|
|
|
If you use **TissueGPT**, please cite the following: |
|
|
|
|
|
***The citation details will be provided shortly.*** |
|
|
## License |
|
|
|
|
|
This model is released under the **CC BY 4.0** license.
|
|
## Contact |
|
|
|
|
|
For questions, issues, or collaboration opportunities, feel free to reach out:
|
|
|
|
|
- **Name**: Saeed Rafieyan |
|
|
- **Website**: Sraf.ir |
|
|
- **Email**: Raf.Biomed@gmail.com |
|
|
- **LinkedIn**: https://www.linkedin.com/in/saeed-rafieyan |