TrafficSignDetector / TRAIN_PROMPT.md

YOLOv8 Traffic Sign Detection Training Script Prompt

Context

The current model has ~3M parameters and a sound architecture, but it is underfitted: it emits ~300 detections per image, all with confidence below 0.0001. It needs retraining with proper hyperparameters.

Requirements

Write a YOLOv8 training script with the following specifications:

1. Dataset Setup

  • Source: GTSRB (German Traffic Sign Recognition Benchmark) - 40,000+ images
  • Format: YOLO format (images/ and labels/ directories)
  • Structure:
    dataset/
      images/
        train/  (70% of data)
        val/    (30% of data)
      labels/
        train/
        val/
    
  • Classes: 43 traffic sign classes (see config.yaml for class names)
  • Class mapping file: dataset.yaml with proper format
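A minimal `dataset.yaml` matching the layout above might look like the sketch below. The paths mirror the directory structure in this section; the class names shown are illustrative placeholders — the full 43-entry list comes from config.yaml.

```yaml
# dataset.yaml — paths are relative to this file's location
path: dataset          # dataset root
train: images/train    # 70% split
val: images/val        # 30% split

# 43 GTSRB classes; entries shown here are placeholders
names:
  0: speed_limit_20
  1: speed_limit_30
  # ... classes 2-41 per config.yaml ...
  42: end_no_passing_trucks
```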

2. Training Hyperparameters

  • Model: YOLOv8n (nano - fastest) or YOLOv8s (small - better accuracy)
  • Epochs: 150-200 (more than 100 for proper convergence)
  • Batch size: 16-32 (adjust based on GPU memory)
  • Image size: 640x640 (match inference size)
  • Learning rate:
    • Initial (lr0): 0.01
    • Final (lrf): 0.01
    • Warmup epochs: 3
  • Optimizer: SGD (not Adam) for YOLOv8
  • Weight decay: 0.0005
  • Momentum: 0.937
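The hyperparameters above map directly onto keyword arguments of ultralytics' `model.train()`. A minimal sketch (argument names follow the ultralytics 8.x API; the dataset path is an assumption):

```python
# Hyperparameters from section 2, collected as kwargs for model.train().
TRAIN_KWARGS = dict(
    data="dataset.yaml",
    epochs=150,            # 150-200 for proper convergence
    batch=16,              # raise to 32 if GPU memory allows
    imgsz=640,             # match inference size
    lr0=0.01,              # initial learning rate
    lrf=0.01,              # final learning rate fraction
    warmup_epochs=3,
    optimizer="SGD",
    weight_decay=0.0005,
    momentum=0.937,
)

def train(model_path="yolov8n.pt"):
    """Run training with the settings above (requires ultralytics>=8.0.0)."""
    from ultralytics import YOLO
    model = YOLO(model_path)   # or "yolov8s.pt" for better accuracy
    return model.train(**TRAIN_KWARGS)
```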

3. Augmentation Settings (Critical!)

- HSV augmentation: h=0.015, s=0.7, v=0.4
- Rotation: degrees=10
- Translation: translate=0.1
- Scale: scale=0.5
- Flip: flipud=0.0, fliplr=0.0 (traffic signs are orientation-sensitive: vertical flips never occur in practice, and horizontal flips swap left/right-directional sign classes)
- Mosaic: mosaic=1.0
- Mixup: mixup=0.1
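These augmentation settings are also plain `model.train()` keyword arguments in ultralytics 8.x. In the sketch below, both flips are set to 0.0 on the assumption that traffic signs are orientation-sensitive (vertical flips never occur, and mirroring swaps left/right-directional classes):

```python
# Augmentation settings from section 3 as ultralytics train() kwargs.
AUG_KWARGS = dict(
    hsv_h=0.015, hsv_s=0.7, hsv_v=0.4,  # HSV color jitter
    degrees=10,                          # max rotation in degrees
    translate=0.1,                       # fraction of image size
    scale=0.5,                           # scale gain
    flipud=0.0,                          # signs never appear upside-down
    fliplr=0.0,                          # mirroring corrupts directional signs
    mosaic=1.0,                          # mosaic augmentation probability
    mixup=0.1,                           # mixup augmentation probability
)

# Pass together with the other hyperparameters, e.g.:
#   model.train(data="dataset.yaml", epochs=150, **AUG_KWARGS)
```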

4. Training Features

  • Early stopping: patience=20 (stop if the validation fitness metric does not improve for 20 epochs)
  • Validation monitoring: track mAP50, precision, recall
  • Model checkpointing: save best.pt when val metric improves
  • Logging: TensorBoard or Weights&Biases integration (optional)

5. Output Structure

runs/detect/train/
├── weights/
│   ├── best.pt      (use this for inference)
│   └── last.pt
├── results.png      (training curves)
└── events.out.tfevents.*

6. Post-Training Validation

After training:

  • Validate on test set
  • Compute metrics (mAP50, mAP50-95, precision, recall)
  • Test on sample images (visual inspection)
  • Compare confidence scores (should be > 0.5 for good detections)
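The validation steps above can be sketched with the ultralytics API. The metric attribute names (`box.map50`, `box.map`, `box.mp`, `box.mr`) follow ultralytics 8.x; the weights path and threshold helper are illustrative, with targets taken from section 9:

```python
def meets_targets(map50: float, precision: float, recall: float) -> bool:
    """True when validation results clear the targets in section 9."""
    return map50 > 0.7 and precision > 0.75 and recall > 0.75

def validate(weights="runs/detect/train/weights/best.pt"):
    """Validate trained weights (requires ultralytics>=8.0.0)."""
    from ultralytics import YOLO
    m = YOLO(weights).val(data="dataset.yaml")
    # DetMetrics: map50 = mAP@0.5, map = mAP@0.5:0.95, mp/mr = precision/recall
    print(f"mAP50={m.box.map50:.3f}  mAP50-95={m.box.map:.3f}")
    print(f"precision={m.box.mp:.3f}  recall={m.box.mr:.3f}")
    return meets_targets(m.box.map50, m.box.mp, m.box.mr)
```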

7. Python Libraries

Required:
- ultralytics>=8.0.0
- torch
- torchvision
- opencv-python
- numpy
- pyyaml

Optional:
- tensorboard (for visualization)
- wandb (for cloud logging)

8. Code Structure

  1. Setup phase: Load config, prepare dataset.yaml
  2. Model initialization: Load pretrained YOLOv8n/s
  3. Training phase: Call model.train() with params
  4. Validation phase: Evaluate on val set
  5. Testing phase: Inference on test images
  6. Save phase: Export best.pt to deployment location

9. Expected Outcomes

After proper training (150 epochs):

  • mAP50: > 0.7 (good)
  • Precision: > 0.75
  • Recall: > 0.75
  • Confidence scores: majority > 0.3 (not 0.0001!)
  • Training time: 2-6 hours on GPU (or 24+ hours on CPU)

10. Deployment

# After training, point config.yaml at the new weights:
model:
  path: 'runs/detect/train/weights/best.pt'
  confidence_threshold: 0.25  # adjust based on the precision/recall tradeoff
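Deployment code can read those settings back with PyYAML and feed them to the model. A sketch, with the config inlined as a string for illustration (in practice, `yaml.safe_load` the real config.yaml from disk):

```python
import yaml

# Example config matching the snippet above.
CONFIG_TEXT = """
model:
  path: runs/detect/train/weights/best.pt
  confidence_threshold: 0.25
"""
cfg = yaml.safe_load(CONFIG_TEXT)["model"]

def detect(image_path):
    """Run inference with the deployed weights (requires ultralytics>=8.0.0)."""
    from ultralytics import YOLO
    model = YOLO(cfg["path"])
    return model(image_path, conf=cfg["confidence_threshold"])
```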

Tips

  1. Monitor training curves (loss should decrease smoothly)
  2. If overfitting: increase augmentation or reduce epochs
  3. If underfitting: increase epochs or reduce augmentation
  4. Use GPU if possible (50x faster than CPU)
  5. Save weights regularly (every 10 epochs)
  6. Validate on completely unseen test set
  7. Test confidence distribution on real images

Example Command Structure

python train.py \
  --data dataset.yaml \
  --model yolov8n.pt \
  --epochs 150 \
  --batch 32 \
  --imgsz 640 \
  --device 0 \
  --patience 20 \
  --augment
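One possible skeleton for the `train.py` invoked above: the argparse flags match the example command and map one-to-one onto ultralytics `train()` keyword arguments; the defaults chosen here are assumptions taken from this document.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """CLI matching the example command; defaults follow this document."""
    p = argparse.ArgumentParser(description="YOLOv8 traffic-sign training")
    p.add_argument("--data", default="dataset.yaml")
    p.add_argument("--model", default="yolov8n.pt")
    p.add_argument("--epochs", type=int, default=150)
    p.add_argument("--batch", type=int, default=32)
    p.add_argument("--imgsz", type=int, default=640)
    p.add_argument("--device", default="0")
    p.add_argument("--patience", type=int, default=20)
    p.add_argument("--augment", action="store_true")
    return p

def main(argv=None):
    args = build_parser().parse_args(argv)
    from ultralytics import YOLO   # requires ultralytics>=8.0.0
    YOLO(args.model).train(
        data=args.data, epochs=args.epochs, batch=args.batch,
        imgsz=args.imgsz, device=args.device, patience=args.patience,
    )
```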

Use this prompt to:

  • Ask an AI to write the complete training script
  • Guide your own script writing
  • Review if training script meets these requirements