---
title: BERT Second Fine-tuning Platform
emoji: 🥼
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.36.0
app_file: app.py
pinned: false
---

πŸ₯Ό BERT Breast Cancer Survival Prediction - Complete Second Fine-tuning Platform

Complete BERT second fine-tuning system supporting the full workflow from first fine-tuning to second fine-tuning, with multi-model comparison on new data.

## 🌟 Core Features

### 1️⃣ First Fine-tuning

- Train from pure BERT
- Supports three fine-tuning methods:
  - **Full Fine-tuning**: trains all parameters
  - **LoRA**: low-rank adaptation, parameter-efficient
  - **AdaLoRA**: adaptive LoRA, dynamically adjusts the rank
- Automatically compares pure BERT vs. first fine-tuning performance
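As a rough illustration of the parameter-efficient options, a LoRA setup with the `peft` library might look like the sketch below (the rank, alpha, and target modules are illustrative assumptions, not necessarily the app's actual settings):

```python
from peft import LoraConfig, TaskType

# Illustrative LoRA configuration; rank, alpha, and target modules are
# assumptions, not necessarily what app.py uses. peft also provides
# AdaLoraConfig for the AdaLoRA variant.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,          # sequence classification head
    r=8,                                 # low-rank dimension of the adapters
    lora_alpha=16,                       # scaling factor
    lora_dropout=0.1,
    target_modules=["query", "value"],   # BERT attention projections
)
```

Applied with `peft.get_peft_model`, a config like this leaves only the small adapter matrices (roughly 1% of the weights) trainable.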

### 2️⃣ Second Fine-tuning

- Continues training from the first fine-tuning model
- Uses new training data
- Automatically inherits the first round's fine-tuning method
- Suitable for incremental learning and domain adaptation
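Conceptually, the second round just reloads the artifacts of the first round and keeps training; a minimal sketch with Transformers, assuming a hypothetical output directory from round one:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical directory saved by the first round (see "Model Storage" below);
# the actual name follows the ./breast_cancer_bert_{method}_{type}_{timestamp}/
# pattern with real values.
first_model_dir = "./breast_cancer_bert_full_first_20240101_120000"

tokenizer = AutoTokenizer.from_pretrained(first_model_dir)
model = AutoModelForSequenceClassification.from_pretrained(first_model_dir)
# Training then continues on dataset B, typically with a smaller
# learning rate (e.g. 1e-5) and fewer epochs than the first round.
```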

### 3️⃣ Test on New Data

- Upload new test data
- Compare up to 3 models simultaneously:
  - Pure BERT (baseline)
  - First fine-tuning model
  - Second fine-tuning model
- All evaluation metrics are displayed side by side

### 4️⃣ Model Prediction

- Select any trained model
- Input medical text for prediction
- Displays predictions from both the non-fine-tuned and fine-tuned models
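Mapping the classifier's raw logits to the 0 = Survival / 1 = Death labels boils down to a softmax and an argmax; a framework-agnostic sketch (the logit values below are made up for illustration):

```python
import math

LABELS = {0: "Survival", 1: "Death"}

def predict_label(logits):
    """Softmax over the two class logits; return (label_name, confidence)."""
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    idx = probs.index(max(probs))
    return LABELS[idx], probs[idx]

# Example with made-up logits from a hypothetical model run:
label, confidence = predict_label([2.0, -1.0])  # -> ("Survival", ~0.95)
```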

πŸ“‹ Data Format

CSV file must contain the following columns:

  • Text: Medical record text (English)
  • label: Label (0=Survival, 1=Death)

Example:

```csv
Text,label
"Patient is a 45-year-old female with stage II breast cancer...",0
"65-year-old woman diagnosed with triple-negative breast cancer...",1
```
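A small helper to check an uploaded CSV against this schema might look like the following sketch (the function name is ours, not part of the app):

```python
import pandas as pd

REQUIRED_COLUMNS = {"Text", "label"}

def validate_dataset(df: pd.DataFrame) -> pd.DataFrame:
    """Check a training/test DataFrame against the expected schema."""
    missing = REQUIRED_COLUMNS - set(df.columns)
    if missing:
        raise ValueError(f"Missing columns: {sorted(missing)}")
    if not df["label"].isin([0, 1]).all():
        raise ValueError("label must be 0 (Survival) or 1 (Death)")
    return df

df = pd.DataFrame({
    "Text": ["Patient is a 45-year-old female...", "65-year-old woman..."],
    "label": [0, 1],
})
validate_dataset(df)  # passes silently when the schema is correct
```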

πŸš€ Usage Workflow

Step 1: First Fine-tuning

  1. Go to "1️⃣ First Fine-tuning" page
  2. Upload training data A (CSV)
  3. Select fine-tuning method (recommend starting with Full Fine-tuning)
  4. Adjust training parameters:
    • Weight Multiplier: 0.8 (handle imbalanced data)
    • Training Epochs: 8-10
    • Learning Rate: 2e-5
  5. Click "Start First Fine-tuning"
  6. Wait for training to complete, review results

### Step 2: Second Fine-tuning

1. Go to the "2️⃣ Second Fine-tuning" page
2. Click "🔄 Refresh Model List"
3. Select the first fine-tuning model
4. Upload new training data B
5. Adjust the training parameters (recommended):
   - Training Epochs: 3-5 (fewer than the first round)
   - Learning Rate: 1e-5 (smaller than the first round)
6. Click "Start Second Fine-tuning"
7. Wait for training to complete

### Step 3: Test on New Data

1. Go to the "3️⃣ Test on New Data" page
2. Upload test data C
3. Select the models to compare:
   - Pure BERT: select "Evaluate Pure BERT"
   - First fine-tuning: select from the dropdown
   - Second fine-tuning: select from the dropdown
4. Click "Start Testing"
5. View the comparison results for all three models

### Step 4: Prediction

1. Go to the "4️⃣ Model Prediction" page
2. Select the model to use
3. Input medical text
4. Click "Start Prediction"
5. View the prediction results

## 🎯 Fine-tuning Method Comparison

| Method | Trainable Parameters | Training Speed | Memory Usage | Performance |
|---|---|---|---|---|
| Full Fine-tuning | 100% | 1x (baseline) | High | Best |
| LoRA | ~1% | 3-5x faster | Low | Good |
| AdaLoRA | ~1% | 3-5x faster | Low | Good |

πŸ’‘ Second Fine-tuning Best Practices

When to Use Second Fine-tuning?

  1. Domain Adaptation

    • First: Use general medical data
    • Second: Use specific hospital/department data
  2. Incremental Learning

    • First: Use historical data
    • Second: Add newly collected data
  3. Data Scarcity

    • First: Use large amount of related domain data
    • Second: Use small amount of target domain data

### Parameter Adjustment Recommendations

| Parameter | First Fine-tuning | Second Fine-tuning | Reason |
|---|---|---|---|
| Epochs | 8-10 | 3-5 | Avoid overfitting |
| Learning Rate | 2e-5 | 1e-5 | Preserve learned knowledge |
| Warmup Steps | 200 | 100 | Less warmup needed |
| Weight Multiplier | Adjust based on data | Adjust based on new data | Handle imbalance |
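With the transformers `Trainer`, the second-round recommendations translate into something like this sketch (the epoch, learning-rate, and warmup values mirror the table; `output_dir`, batch size, and the metric name are illustrative assumptions):

```python
from transformers import TrainingArguments

# Second-round hyperparameters from the recommendation table above.
second_round_args = TrainingArguments(
    output_dir="./second_finetune_run",   # illustrative path
    num_train_epochs=4,                   # 3-5 recommended, fewer than round one
    learning_rate=1e-5,                   # smaller than round one's 2e-5
    warmup_steps=100,                     # half of round one's 200
    per_device_train_batch_size=16,       # assumption, not from the doc
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,          # keep the best checkpoint
    metric_for_best_model="f1",           # assumption, not from the doc
)
```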

### Important Notes

⚠️ Critical reminders:

- Second fine-tuning automatically uses the first round's fine-tuning method; this cannot be changed
- Use a smaller learning rate in the second round to avoid "catastrophic forgetting"
- If the second dataset differs greatly from the first, more epochs may be needed
- Always test on new data to ensure there is no performance degradation

πŸ“Š Evaluation Metrics Explanation

Metric Description Use Case
F1 Score Harmonic mean of precision and recall Balanced evaluation, general metric
Accuracy Overall accuracy Use when data is balanced
Precision Accuracy of death predictions Optimize to avoid false positives
Recall Proportion of actual deaths identified Optimize to avoid missed diagnoses
Sensitivity Same as Recall Commonly used in medical scenarios
Specificity Proportion of actual survivals identified Avoid overtreatment
AUC Area under ROC curve Overall classification ability
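Most of these metrics (including specificity, which scikit-learn does not expose directly) can be derived from the confusion matrix; a short sketch with made-up predictions:

```python
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

# Made-up ground truth and predictions (0 = Survival, 1 = Death).
y_true = [0, 0, 0, 1, 1, 1, 0, 1]
y_pred = [0, 0, 1, 1, 1, 0, 0, 1]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
metrics = {
    "accuracy": accuracy_score(y_true, y_pred),
    "precision": precision_score(y_true, y_pred),  # of predicted deaths, how many were real
    "recall": recall_score(y_true, y_pred),        # sensitivity: real deaths identified
    "specificity": tn / (tn + fp),                 # real survivals identified
    "f1": f1_score(y_true, y_pred),
}
```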

πŸ”§ Technical Details

Training Process

  1. Data Preparation

    • Load CSV
    • Maintain original class ratio
    • Tokenization (max_length=256)
    • 80/20 train/validation split
  2. Model Initialization

    • First: Load from bert-base-uncased
    • Second: Load from first fine-tuning model
    • Apply PEFT configuration (if using LoRA/AdaLoRA)
  3. Training

    • Use class weights to handle imbalance
    • Early stopping (based on validation set)
    • Save best model
  4. Evaluation

    • Evaluate on validation set
    • Calculate all metrics
    • Generate confusion matrix
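One plausible way the class weights and the Weight Multiplier could interact is inverse-frequency weighting with the multiplier scaling the positive (death) class; this is an assumption about the mechanism, not the app's exact formula:

```python
from collections import Counter

def class_weights(labels, weight_multiplier=1.0):
    """Inverse-frequency class weights; the multiplier further scales the
    positive (death) class. The formula is illustrative, not app.py's own."""
    counts = Counter(labels)
    total = len(labels)
    n_classes = len(counts)
    weights = {c: total / (n_classes * n) for c, n in counts.items()}
    weights[1] = weights.get(1, 1.0) * weight_multiplier  # scale death class
    return weights

# Example: 6 survivals, 2 deaths -> the death class gets up-weighted.
w = class_weights([0, 0, 0, 0, 0, 0, 1, 1], weight_multiplier=0.8)
```

Such weights would typically be passed to the loss function (e.g. a weighted cross-entropy) during training.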

### Model Storage

- Model files: `./breast_cancer_bert_{method}_{type}_{timestamp}/`
- Model list: `./saved_models_list.json`
- Each entry includes all training information and hyperparameters

πŸ› Common Questions

Q1: Why can't I change methods in second fine-tuning?

A: Because different methods have different parameter structures. For example, LoRA adds low-rank matrices; if you switch to Full Fine-tuning, these parameters would be lost.

**Q2: How much data does the second fine-tuning need?**

A: We recommend at least 100 samples, though fewer than the first round is fine. If the data is too scarce, the model may overfit.

**Q3: How do I choose the optimization metric?**

A:

- Medical scenarios usually prioritize Recall (avoid missed diagnoses)
- If false positives are costly, choose Precision
- For balanced scenarios, choose F1 Score

**Q4: What if GPU memory is insufficient?**

A:

- Use LoRA or AdaLoRA (reduces memory usage by roughly 90%)
- Reduce the batch size
- Reduce `max_length`

**Q5: What if training takes too long?**

A:

- Use LoRA/AdaLoRA (3-5x faster)
- Reduce the number of epochs
- Increase the batch size (if memory allows)

πŸ“ Version Information

  • Version: 1.0.0
  • Python: 3.10+
  • Main Dependencies:
    • transformers 4.36.0
    • torch 2.1.0
    • peft 0.7.1
    • gradio 4.36.0

πŸ“„ License

This project completely preserves your original program logic, only adding second fine-tuning and testing features.

πŸ™ Acknowledgments

Developed based on BERT model and Hugging Face Transformers library.