---
language: "en"
tags:
- biomedical
- text-generation
- BioGPT
- fine-tuning
license: "cc-by-4.0"
datasets:
- custom
metrics:
- perplexity
- loss
---
# TissueGPT: Fine-Tuned BioGPT for Tissue Engineering Text Generation
## Model Description
**TissueGPT** is a fine-tuned version of [BioGPT](https://huggingface.co/microsoft/BioGPT), specifically tailored for tissue engineering text generation tasks. By leveraging a dataset of biomedical research articles (titles, abstracts, and full texts), TissueGPT is designed to perform tasks such as:
- Summarizing biomedical literature
- Generating coherent biomedical text
- Assisting with scientific writing in life sciences
- Supporting research in tissue engineering, extracellular matrix (ECM) analysis, and related fields
---
## Training Details
### First Round of Training
The initial model was fine-tuned for **3 epochs**, focusing on general adaptation to the biomedical dataset.
#### Hyperparameters
- **Learning Rate**: 5e-5
- **Batch Size**: 8
- **Warmup Steps**: 500
- **Precision**: Mixed precision (`fp16`)
- **Weight Decay**: 0.01
- **Number of Epochs**: 3
- **Save Checkpoints**: Every 10,000 steps, keeping the last 3 checkpoints
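As a rough sketch, these settings map onto Hugging Face `TrainingArguments` as follows; the output directory and evaluation settings here are illustrative assumptions, not the actual training script:
```python
from transformers import TrainingArguments

# Hypothetical first-round configuration; paths and evaluation settings
# are placeholders, not the published training script.
training_args = TrainingArguments(
    output_dir="./tissuegpt-round1",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    warmup_steps=500,
    fp16=True,                        # mixed precision
    weight_decay=0.01,
    num_train_epochs=3,
    save_steps=10_000,                # checkpoint every 10,000 steps
    save_total_limit=3,               # keep only the last 3 checkpoints
    evaluation_strategy="epoch",      # per-epoch validation, matching the table below
)
```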
#### Training and Validation Metrics
| Epoch | Training Loss | Validation Loss | Perplexity |
|-------|---------------|-----------------|------------|
| 1 | 2.4752 | 2.4286 | 11.34 |
| 2 | 2.3680 | 2.3708 | 10.70 |
| 3 | 2.2954 | 2.3410 | 10.39 |
---
### Second Round of Training
To further improve performance, the model was fine-tuned for **2 additional epochs** with adjusted hyperparameters.
#### Adjusted Hyperparameters
- **Learning Rate**: 3e-5 (reduced for finer updates)
- **Batch Size**: 64 (to utilize the GPU’s full memory)
- **Precision**: `bf16` (optimized for NVIDIA A100)
- **Save Checkpoints**: Every 20,000 steps
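A hypothetical sketch of the corresponding adjustments, resuming from the round-one checkpoint (names and paths are again illustrative):
```python
from transformers import TrainingArguments

# Hypothetical round-two settings; not the published training script.
training_args = TrainingArguments(
    output_dir="./tissuegpt-round2",  # placeholder path
    learning_rate=3e-5,               # reduced for finer updates
    per_device_train_batch_size=64,   # larger batches on the A100 80GB
    bf16=True,                        # bfloat16, well supported on A100
    num_train_epochs=2,
    save_steps=20_000,                # checkpoint every 20,000 steps
)
```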
#### Training and Validation Metrics
| Epoch | Training Loss | Validation Loss | Perplexity |
|-------|---------------|-----------------|------------|
| 4 | 2.2396 | 2.2395 | 9.43 |
| 5 | 2.2328 | 2.2328 | 9.32 |
### Hardware Used
- **GPU**: NVIDIA A100 80GB
- **Framework**: PyTorch with Hugging Face Transformers library
---
## Evaluation Metrics
### Perplexity
Perplexity measures how well a language model predicts held-out text and is computed as the exponential of the cross-entropy loss; lower values indicate more fluent and coherent generations.
- **First Round of Training**: Final perplexity = **10.39**
- **Second Round of Training**: Final perplexity = **9.32**
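Because perplexity is the exponential of the cross-entropy loss, the reported values can be sanity-checked directly from the validation losses in the tables above:
```python
import math

# Perplexity = exp(cross-entropy loss)
print(math.exp(2.3410))  # ~10.39, end of first round
print(math.exp(2.2328))  # ~9.33, end of second round (reported as 9.32,
                         # presumably computed from the unrounded loss)
```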
### Gradient Norms
- Gradient norms were tracked throughout training to monitor stability.
- Observed range: **1.05–1.32**, indicating stable optimization without exploding or vanishing gradients.
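For illustration, the quantity being tracked is the L2 norm over all parameter gradients; a toy stand-in (not the actual fine-tuning loop) might compute it as:
```python
import torch
import torch.nn as nn

# Toy illustration of the tracked quantity: the L2 norm of all parameter
# gradients after a backward pass. The 1.05-1.32 range above came from the
# actual fine-tuning logs; this tiny model is just a stand-in.
model = nn.Linear(16, 16)
loss = model(torch.randn(4, 16)).pow(2).mean()
loss.backward()
per_param = [p.grad.detach().norm(2) for p in model.parameters() if p.grad is not None]
total_norm = torch.stack(per_param).norm(2)
print(f"total gradient norm: {total_norm.item():.3f}")
```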
### Validation Loss
- Decreasing validation loss across both rounds suggests effective generalization to unseen data.
---
## Model Comparison
| Metric | First Round | Second Round |
|--------------------|-------------|--------------|
| Final Validation Loss | 2.3410 | 2.2328 |
| Final Perplexity | 10.39 | 9.32 |
**Key Insights**:
- Additional training epochs led to improved generalization and better predictive performance.
- Perplexity improved by approximately 10% in the second round, demonstrating enhanced text fluency and coherence.
---
## How to Use the Model
### Install Dependencies
Ensure you have `transformers` and `torch` installed:
```bash
pip install transformers torch
```
### Load the Model
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the fine-tuned model and tokenizer from the Hugging Face Hub
model_name = "Saeed/TissueGPT"  # Replace with the uploaded repo name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Generate a continuation for a tissue-engineering prompt
input_text = "The extracellular matrix plays a critical role in tissue engineering because"
inputs = tokenizer(input_text, return_tensors="pt")
output = model.generate(**inputs, max_length=50)  # max_length counts prompt + generated tokens
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
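For more varied output, sampling parameters can be passed to `generate`; the values below are illustrative defaults, not settings tuned for TissueGPT:
```python
# Sampling instead of greedy decoding; parameter values are illustrative
output = model.generate(
    **inputs,
    max_new_tokens=100,  # cap newly generated tokens rather than total length
    do_sample=True,      # enable sampling
    top_p=0.9,           # nucleus sampling
    temperature=0.7,     # soften the output distribution
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```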
---
## Intended Use
- **Biomedical text generation and summarization**
- **Assisting researchers, scientists, and medical professionals**
- **Automated scientific writing** in domains such as tissue engineering and scaffold fabrication
---
## Limitations
- The model is fine-tuned on biomedical literature and may not generalize well to non-biomedical domains.
- Outputs should always be validated by experts for accuracy, especially in clinical or research-critical contexts.
---
## Ethical Considerations
- This model is intended for use in biomedical research and not for clinical diagnosis or patient care.
- It may generate plausible-sounding but factually incorrect outputs (hallucinations). Always verify generated content.
---
## Citation
If you use **TissueGPT**, please cite the following:
***The citation details will be provided shortly.***
## License
Licensed under the **CC BY 4.0** License.
## Contact
For questions, issues, or collaboration opportunities, feel free to reach out:
- **Name**: Saeed Rafieyan
- **Website**: Sraf.ir
- **Email**: Raf.Biomed@gmail.com
- **LinkedIn**: https://www.linkedin.com/in/saeed-rafieyan |