Update README.md
README.md
CHANGED
|
@@ -16,45 +16,6 @@ SerialNo_Height_Weight_Gender_Age.png/jpg
|
|
| 16 |
Example: 1021_5.5h_51w_female_26a.png
|
| 17 |
```
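The naming convention above (`SerialNo_Height_Weight_Gender_Age.png/jpg`) can be parsed with a regular expression. The following is a minimal sketch, not the contents of `dataset_parser.py`: the function name `parse_label` and the returned keys are hypothetical, and the units of height and weight are left as raw numbers because the README does not state them.

```python
import re
from pathlib import Path

# Pattern for file stems such as "1021_5.5h_51w_female_26a".
FILENAME_RE = re.compile(
    r"^(?P<serial>\d+)_(?P<height>\d+(?:\.\d+)?)h_(?P<weight>\d+(?:\.\d+)?)w_"
    r"(?P<gender>[A-Za-z]+)_(?P<age>\d+)a$"
)


def parse_label(filename):
    """Return the labels encoded in a filename, or None if it does not match."""
    path = Path(filename)
    if path.suffix.lower() not in {".png", ".jpg"}:
        return None
    match = FILENAME_RE.match(path.stem)
    if match is None:
        return None
    return {
        "serial": int(match["serial"]),
        "height": float(match["height"]),  # unit not stated in the README
        "weight": float(match["weight"]),  # unit not stated in the README
        "gender": match["gender"].lower(),
        "age": int(match["age"]),
    }


labels = parse_label("1021_5.5h_51w_female_26a.png")
```

Files that do not follow the convention return `None`, so a caller can skip them rather than fail partway through the dataset.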
|
| 18 |
|
| 19 |
-
## Setup
|
| 20 |
-
|
| 21 |
-
### 1. Install Dependencies
|
| 22 |
-
|
| 23 |
-
```bash
|
| 24 |
-
pip install -r ../requirements.txt
|
| 25 |
-
```
|
| 26 |
-
|
| 27 |
-
Key dependencies:
|
| 28 |
-
- `torch>=2.0.0` - PyTorch for deep learning
|
| 29 |
-
- `transformers>=4.30.0` - Hugging Face transformers library
|
| 30 |
-
- `accelerate>=0.20.0` - For efficient training
|
| 31 |
-
|
| 32 |
-
### 2. Verify Dataset Location
|
| 33 |
-
|
| 34 |
-
Ensure your dataset is located at:
|
| 35 |
-
```
|
| 36 |
-
D:\fit_model\finetune_model\Celeb-FBI Dataset
|
| 37 |
-
```
|
| 38 |
-
|
| 39 |
-
## Usage
|
| 40 |
-
|
| 41 |
-
### Step 1: Parse Dataset (Optional)
|
| 42 |
-
|
| 43 |
-
If you haven't created the CSV file yet, run:
|
| 44 |
-
|
| 45 |
-
```bash
|
| 46 |
-
python dataset_parser.py
|
| 47 |
-
```
|
| 48 |
-
|
| 49 |
-
This will create `dataset_labels.csv` with parsed height and weight labels from filenames.
|
| 50 |
-
|
| 51 |
-
### Step 2: Fine-tune the Model
|
| 52 |
-
|
| 53 |
-
Run the training script:
|
| 54 |
-
|
| 55 |
-
```bash
|
| 56 |
-
python train_vit.py
|
| 57 |
-
```
|
| 58 |
|
| 59 |
#### Training Parameters (Optimized for 4GB GPU)
|
| 60 |
|
|
@@ -65,18 +26,7 @@ The script uses memory-efficient techniques:
|
|
| 65 |
- **Learning rate**: 2e-5 (standard for fine-tuning)
|
| 66 |
- **Epochs**: 10 (adjustable)
|
| 67 |
|
| 68 |
-
|
| 69 |
-
|
| 70 |
-
```bash
|
| 71 |
-
python train_vit.py \
|
| 72 |
-
--dataset_dir "D:\fit_model\finetune_model\Celeb-FBI Dataset" \
|
| 73 |
-
--csv_file "D:\fit_model\finetune_model\dataset_labels.csv" \
|
| 74 |
-
--output_dir "D:\fit_model\finetune_model\checkpoints" \
|
| 75 |
-
--batch_size 4 \
|
| 76 |
-
--accumulation_steps 8 \
|
| 77 |
-
--epochs 10 \
|
| 78 |
-
--learning_rate 2e-5
|
| 79 |
-
```
|
| 80 |
|
| 81 |
**Arguments:**
|
| 82 |
- `--dataset_dir`: Path to Celeb-FBI Dataset directory
|
|
@@ -104,15 +54,6 @@ The training script includes several optimizations:
|
|
| 104 |
3. **Mixed Precision**: Uses FP16 training to reduce memory usage by ~50%
|
| 105 |
4. **Efficient Data Loading**: Uses `pin_memory` and multiple workers for faster data transfer
|
| 106 |
|
| 107 |
-
## Output Files
|
| 108 |
-
|
| 109 |
-
After training, the following files will be created in the output directory:
|
| 110 |
-
|
| 111 |
-
- `best_model.pt`: Best model checkpoint (lowest validation loss)
|
| 112 |
-
- `final_model.pt`: Final model after all epochs
|
| 113 |
-
- `checkpoint_epoch_N.pt`: Periodic checkpoints every 5 epochs
|
| 114 |
-
- `dataset_stats.json`: Dataset statistics (mean, std) for denormalization
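Since `dataset_stats.json` stores a mean and standard deviation for denormalization, model outputs presumably need to be mapped back to the original scale before they are reported. The sketch below assumes z-score normalization; the key names (`height_mean`, `height_std`, `weight_mean`, `weight_std`) and all numeric values are hypothetical and may differ from what the training script writes.

```python
def denormalize(value, mean, std):
    """Invert z-score normalization, where normalized = (raw - mean) / std."""
    return value * std + mean


# Hypothetical statistics; real values and key names come from dataset_stats.json.
stats = {
    "height_mean": 5.5,
    "height_std": 0.5,
    "weight_mean": 60.0,
    "weight_std": 10.0,
}

# Hypothetical normalized model outputs for one image.
normalized_prediction = {"height": 0.5, "weight": -1.0}

height = denormalize(
    normalized_prediction["height"], stats["height_mean"], stats["height_std"]
)
weight = denormalize(
    normalized_prediction["weight"], stats["weight_mean"], stats["weight_std"]
)
```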
|
| 115 |
-
|
| 116 |
## Loading the Trained Model
|
| 117 |
|
| 118 |
```python
|
|
@@ -120,7 +61,7 @@ import torch
|
|
| 120 |
from model import ViTHeightWeightModel
|
| 121 |
|
| 122 |
# Load checkpoint
|
| 123 |
-
checkpoint = torch.load('
|
| 124 |
dataset_stats = checkpoint['dataset_stats']
|
| 125 |
|
| 126 |
# Initialize model
|
|
@@ -140,7 +81,7 @@ import torch
|
|
| 140 |
from model import ViTHeightWeightModel
|
| 141 |
|
| 142 |
# Load model and processor
|
| 143 |
-
checkpoint = torch.load('
|
| 144 |
model = ViTHeightWeightModel(model_name=checkpoint['model_name'])
|
| 145 |
model.load_state_dict(checkpoint['model_state_dict'])
|
| 146 |
model.eval()
|
|
@@ -186,31 +127,6 @@ If you encounter OOM errors:
|
|
| 186 |
- Use SSD storage for faster data loading
|
| 187 |
- Consider using a smaller model variant if needed
|
| 188 |
|
| 189 |
-
## Files Structure
|
| 190 |
-
|
| 191 |
-
```
|
| 192 |
-
finetune_model/
|
| 193 |
-
├── Celeb-FBI Dataset/ # Dataset directory
|
| 194 |
-
├── dataset_parser.py # Parse filenames to extract labels
|
| 195 |
-
├── vit_dataset.py # PyTorch Dataset class
|
| 196 |
-
├── model.py # ViT model architecture
|
| 197 |
-
├── train_vit.py # Main training script
|
| 198 |
-
├── dataset_labels.csv # Generated CSV with labels
|
| 199 |
-
├── checkpoints/ # Saved model checkpoints
|
| 200 |
-
│   ├── best_model.pt
|
| 201 |
-
│   ├── final_model.pt
|
| 202 |
-
│   └── dataset_stats.json
|
| 203 |
-
└── README.md # This file
|
| 204 |
-
```
|
| 205 |
-
|
| 206 |
-
## Notes
|
| 207 |
-
|
| 208 |
-
- The model normalizes height and weight during training for better convergence
|
| 209 |
-
- Training time: ~2-4 hours on RTX 3050 (4GB) for 10 epochs
|
| 210 |
-
- The model uses a multi-task approach, learning height and weight simultaneously
|
| 211 |
-
- Early stopping can be implemented by monitoring validation loss
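The last note mentions early stopping without showing it. One common form is a patience counter on the validation loss, sketched below. The class name `EarlyStopping`, the helper `epochs_run`, and the `patience` and `min_delta` parameters are illustrative and are not part of `train_vit.py`.

```python
class EarlyStopping:
    """Stop training once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def epochs_run(val_losses, patience=3):
    """Return how many epochs would run before early stopping triggers."""
    stopper = EarlyStopping(patience=patience)
    for epoch, loss in enumerate(val_losses, start=1):
        if stopper.step(loss):
            return epoch
    return len(val_losses)
```

In a real training loop, `step` would be called once per epoch with the measured validation loss, and the loop would break when it returns `True`.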
|
| 212 |
-
|
| 213 |
-
|
| 214 |
|
| 215 |
|
| 216 |
|
|
|
|
| 16 |
Example: 1021_5.5h_51w_female_26a.png
|
| 17 |
```
|
| 18 |
|
|
| 19 |
|
| 20 |
#### Training Parameters (Optimized for 4GB GPU)
|
| 21 |
|
|
|
|
| 26 |
- **Learning rate**: 2e-5 (standard for fine-tuning)
|
| 27 |
- **Epochs**: 10 (adjustable)
|
| 28 |
|
| 29 |
+
|
| 30 |
|
| 31 |
**Arguments:**
|
| 32 |
- `--dataset_dir`: Path to Celeb-FBI Dataset directory
|
|
|
|
| 54 |
3. **Mixed Precision**: Uses FP16 training to reduce memory usage by ~50%
|
| 55 |
4. **Efficient Data Loading**: Uses `pin_memory` and multiple workers for faster data transfer
|
| 56 |
|
|
| 57 |
## Loading the Trained Model
|
| 58 |
|
| 59 |
```python
|
|
|
|
| 61 |
from model import ViTHeightWeightModel
|
| 62 |
|
| 63 |
# Load checkpoint
|
| 64 |
+
checkpoint = torch.load('Rithankoushik/Finetuned_VITmodel/best_model.pt')
|
| 65 |
dataset_stats = checkpoint['dataset_stats']
|
| 66 |
|
| 67 |
# Initialize model
|
|
|
|
| 81 |
from model import ViTHeightWeightModel
|
| 82 |
|
| 83 |
# Load model and processor
|
| 84 |
+
checkpoint = torch.load('Rithankoushik/Finetuned_VITmodel/best_model.pt')
|
| 85 |
model = ViTHeightWeightModel(model_name=checkpoint['model_name'])
|
| 86 |
model.load_state_dict(checkpoint['model_state_dict'])
|
| 87 |
model.eval()
|
|
|
|
| 127 |
- Use SSD storage for faster data loading
|
| 128 |
- Consider using a smaller model variant if needed
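When out-of-memory errors force a small per-step batch, gradient accumulation keeps the effective batch size unchanged. The example command in the earlier revision of this README used `--batch_size 4` with `--accumulation_steps 8`, which gives an effective batch of 32. The sketch below uses a toy one-parameter model in plain Python to show why averaging micro-batch gradients matches the full-batch gradient. It is an illustration of the general technique, not the code in `train_vit.py`, and all function names are hypothetical.

```python
import math


def mse_grad(w, batch):
    """Gradient of mean squared error for the model y = w * x over one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data, batch_size, accumulation_steps):
    """Average micro-batch gradients, as scaling each loss by 1/accumulation_steps does."""
    grad = 0.0
    for step in range(accumulation_steps):
        micro_batch = data[step * batch_size:(step + 1) * batch_size]
        grad += mse_grad(w, micro_batch) / accumulation_steps
    return grad


# Toy data: 32 samples of y = 3x + 1.
data = [(float(i), 3.0 * i + 1.0) for i in range(32)]
batch_size, accumulation_steps = 4, 8
effective_batch_size = batch_size * accumulation_steps

full = mse_grad(0.5, data)
accumulated = accumulated_grad(0.5, data, batch_size, accumulation_steps)
gradients_match = math.isclose(full, accumulated, rel_tol=1e-9)
```

The equality holds because all micro-batches have the same size. With uneven micro-batches, the per-batch gradients would need to be weighted by sample count.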
|
| 129 |
|
|
| 130 |
|
| 131 |
|
| 132 |
|