Model Comparison Scripts
This directory contains scripts to compare old models vs new models vs ground truth annotations.
Files
- original_annotations.py: Parses CVAT XML annotations and converts them to COCO format
- old_models.py: Runs old models (Line, Border, Zones) and converts predictions to COCO
- new_models.py: Runs new models (emanuskript, catmus, zone) and converts predictions to COCO
- compare.py: Main script that orchestrates the comparison and calculates metrics
Setup
- Install required dependencies:
pip install pycocotools numpy pillow matplotlib ultralytics
- Ensure model files are in the project root:
  - Old models: best_line_detection_yoloe (1).pt, border_model_weights.pt, zones_model_weights.pt
  - New models: best_emanuskript_segmentation.pt, best_catmus.pt, best_zone_detection.pt
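Before a long run, it can help to confirm that all six weight files are in place. The following is a minimal sketch, not part of the scripts in this directory: the helper name missing_model_files is hypothetical, and the project root is assumed to be whatever path you pass in.

```python
from pathlib import Path

# File names exactly as listed in this README.
OLD_MODELS = [
    "best_line_detection_yoloe (1).pt",
    "border_model_weights.pt",
    "zones_model_weights.pt",
]
NEW_MODELS = [
    "best_emanuskript_segmentation.pt",
    "best_catmus.pt",
    "best_zone_detection.pt",
]


def missing_model_files(project_root):
    """Return the listed weight files that are not present in project_root.

    Hypothetical helper for illustration; the comparison scripts may
    locate their weights differently.
    """
    root = Path(project_root)
    return [name for name in OLD_MODELS + NEW_MODELS if not (root / name).is_file()]
```

Calling missing_model_files(".") from the project root should return an empty list when everything is in place.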
Usage
Run the main comparison script:
cd /home/hasan/layout/compare/data
python compare.py
The script will:
- Load ground truth annotations from Aleyna 1 (2024)/Annotations/annotations.xml
- Run old models on all images in Aleyna 1 (2024)/Images
- Run new models on all images
- Calculate metrics (mAP@50, mAP@[.50:.95], Precision, Recall)
- Create side-by-side visualizations for each image
Output
Results are saved to the results/ directory:
- ground_truth.json: Ground truth in COCO format
- old_models_merged.json: Old models predictions
- new_models_merged.json: New models predictions
- metrics.json: Calculated metrics for both model sets
- visualizations/: Side-by-side comparison images
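For readers unfamiliar with COCO, the sketch below shows the general shape of a COCO detection file: an images list, a categories list, and an annotations list whose bbox is [x, y, width, height]. The file name, category name, and coordinates are made up for illustration; the exact fields written by these scripts are not shown here and may differ (prediction files often also carry a score per detection).

```python
import json


def bbox_annotation(ann_id, image_id, category_id, x, y, w, h):
    """Build one COCO-style annotation; bbox is [x, y, width, height] in pixels."""
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        "bbox": [x, y, w, h],
        "area": w * h,
        "iscrowd": 0,
    }


# Illustrative values only; not taken from the actual dataset.
coco = {
    "images": [{"id": 1, "file_name": "page_001.jpg", "width": 1000, "height": 1500}],
    "categories": [{"id": 1, "name": "Line"}],
    "annotations": [bbox_annotation(1, 1, 1, 100, 200, 300, 40)],
}

# Serialized form, as it would appear in a .json file.
text = json.dumps(coco, indent=2)
```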
Metrics
The comparison calculates:
- mAP@50: Mean Average Precision at IoU=0.50
- mAP@[.50:.95]: Mean Average Precision averaged over IoU thresholds from 0.50 to 0.95
- Precision: Approximated from mAP@50
- Recall: Maximum recall with 100 detections
- F1 Score: Harmonic mean of Precision and Recall
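The quantities behind these metrics can be sketched in a few lines. This is an illustration of the definitions, not the code in compare.py: box_iou, IOU_THRESHOLDS, and f1_score are hypothetical names, and boxes are assumed to be in COCO [x, y, width, height] form.

```python
def box_iou(a, b):
    """Intersection over Union of two boxes given as [x, y, width, height]."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    inter_w = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    inter_h = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


# mAP@[.50:.95] averages AP over these ten IoU thresholds (step 0.05).
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    return 2 * precision * recall / total if total > 0 else 0.0
```

If the metrics come from pycocotools' COCOeval, mAP@[.50:.95], mAP@50, and recall at 100 detections would correspond to stats[0], stats[1], and stats[8] respectively; whether compare.py reads them that way is an assumption.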
Notes
- The CVAT XML parser handles RLE (Run-Length Encoding) format masks
- Category alignment is performed automatically to match ground truth categories
- Images are processed sequentially, so large image sets may take some time
- Visualizations show: Original+GT | Old Models | New Models side-by-side
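As a rough illustration of the RLE handling mentioned above, the sketch below decodes a run-length string into a binary mask. It assumes the CVAT convention of comma-separated run lengths in row-major order that alternate between background and foreground, starting with background, over a width x height crop. The function name decode_rle is hypothetical and is not the parser in original_annotations.py.

```python
def decode_rle(rle, width, height):
    """Decode comma-separated run lengths into a height x width grid of 0/1.

    Assumption: runs alternate background (0) and foreground (1),
    starting with background, in row-major order.
    """
    counts = [int(c) for c in rle.split(",") if c.strip()]
    flat = []
    value = 0  # assumed: the first run is background
    for count in counts:
        flat.extend([value] * count)
        value = 1 - value
    if len(flat) != width * height:
        raise ValueError(
            f"RLE covers {len(flat)} pixels, expected {width * height}"
        )
    # Split the flat pixel list into rows.
    return [flat[r * width:(r + 1) * width] for r in range(height)]
```

For example, decode_rle("2, 3, 1", 3, 2) yields [[0, 0, 1], [1, 1, 0]]. In a CVAT export the crop would then be placed at the mask's left/top offset within the full image.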