Model Combination Guide
Overview
This guide explains how to combine predictions from three YOLO models to produce a unified COCO-format output with only the classes defined in coco_class_mapping.
The Three Models
1. best_emanuskript_segmentation.pt
- Type: Segmentation model
- Classes: 21 classes including:
  - Border, Table, Diagram, Music
  - Main script black/coloured
  - Variant script black/coloured
  - Plain initial (coloured/highlighted/black)
  - Historiated, Inhabited, Embellished
  - Page Number, Quire Mark, Running header, Catchword, Gloss, Illustrations
2. best_catmus.pt
- Type: Segmentation model
- Classes: 19 classes including:
  - DefaultLine, InterlinearLine
  - MainZone, MarginTextZone
  - DropCapitalZone, GraphicZone, MusicZone
  - NumberingZone, QuireMarksZone, RunningTitleZone
  - StampZone, TitlePageZone
3. best_zone_detection.pt
- Type: Detection model
- Classes: 11 zone classes:
  - MainZone, MarginTextZone
  - DropCapitalZone, GraphicZone, MusicZone
  - NumberingZone, QuireMarksZone, RunningTitleZone
  - StampZone, TitlePageZone, DigitizationArtefactZone
How It Works
Step 1: Run Model Predictions
Each model is run independently on the input image:
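from ultralytics import YOLO

image_path = "path/to/image.jpg"  # placeholder input image
# Emanuskript model: every class ID except 19
emanuskript_model = YOLO("best_emanuskript_segmentation.pt")
emanuskript_results = emanuskript_model.predict(image_path, classes=[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,20])
# Catmus model: DefaultLine and InterlinearLine only
catmus_model = YOLO("best_catmus.pt")
catmus_results = catmus_model.predict(image_path, classes=[1,7])
# Zone model: all classes
zone_model = YOLO("best_zone_detection.pt")
zone_results = zone_model.predict(image_path)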
Predictions are saved to JSON files in separate folders.
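The on-disk layout is not spelled out in this guide. As a minimal sketch, assuming one JSON file per image inside a per-model folder, the saving step could look like the following. The `save_predictions` helper, the folder names, and the record fields are all hypothetical and do not reflect the project's actual schema.

```python
import json
from pathlib import Path


def save_predictions(root, model_name, image_name, predictions):
    """Write one JSON file per image under a per-model folder (assumed layout)."""
    out_dir = Path(root) / model_name
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / (Path(image_name).stem + ".json")
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(predictions, f, indent=2)
    return out_path


# Hypothetical record: class name, confidence, and polygon points.
example = [
    {
        "name": "DefaultLine",
        "confidence": 0.91,
        "points": [[10, 10], [200, 10], [200, 30], [10, 30]],
    }
]
```

Keeping each model's output in its own folder lets the combination step load the three sources independently and match them by image file name.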
Step 2: Combine Predictions (ImageBatch Class)
The ImageBatch class (utils/image_batch_classes.py) handles:
1. Loading Images: Loads the image and gets its dimensions
2. Loading Annotations: Loads predictions from all 3 JSON files
3. Unifying Names: Maps class names using catmus_zones_mapping:
   - DefaultLine → Main script black
   - InterlinearLine → Gloss
   - MainZone → Column
   - DropCapitalZone → Plain initial- coloured
   - etc.
4. Filtering Annotations:
   - Removes overlapping annotations based on spatial indexing
   - Uses overlap thresholds (0.3-0.8 depending on class)
   - Handles conflicts between different model predictions
5. COCO Format Conversion: Converts to COCO JSON format
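The filtering step can be illustrated with a simplified sketch. It assumes axis-aligned boxes in [x, y, width, height] form, measures overlap as the intersection area divided by the smaller box's area, and keeps the higher-scoring annotation when two overlap too much. It compares every pair directly rather than using a spatial index. The threshold values, the `score` field, and the function names are illustrative assumptions; the actual logic lives in ImageBatch and may differ.

```python
def overlap_ratio(a, b):
    """Intersection area divided by the smaller box's area; boxes are [x, y, w, h]."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = min(ax + aw, bx + bw) - max(ax, bx)
    ih = min(ay + ah, by + bh) - max(ay, by)
    if iw <= 0 or ih <= 0:
        return 0.0
    smaller = min(aw * ah, bw * bh)
    return (iw * ih) / smaller if smaller > 0 else 0.0


# Hypothetical per-class thresholds within the 0.3-0.8 range mentioned above.
THRESHOLDS = {"Main script black": 0.8, "Gloss": 0.5}
DEFAULT_THRESHOLD = 0.3


def filter_overlaps(annotations):
    """Greedy filter: visit annotations by descending score and drop any
    that overlaps an already-kept annotation beyond its class threshold."""
    kept = []
    for ann in sorted(annotations, key=lambda a: a["score"], reverse=True):
        limit = THRESHOLDS.get(ann["name"], DEFAULT_THRESHOLD)
        if all(overlap_ratio(ann["bbox"], k["bbox"]) <= limit for k in kept):
            kept.append(ann)
    return kept
```

The pairwise comparison is quadratic in the number of annotations. A spatial index, as the guide describes, avoids that by only comparing annotations whose boxes are near each other.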
Step 3: Filter to coco_class_mapping
Only annotations with classes in coco_class_mapping are kept (25 classes total).
Key Functions
predict_annotations() (in utils/data.py)
- Runs a single model on an image
- Saves predictions to JSON
- Used by Celery tasks for async processing
unify_predictions() (in utils/data.py)
- Combines predictions from all three models
- Uses ImageBatch to process and filter
- Returns COCO format JSON
- Imports annotations into database
ImageBatch class (in utils/image_batch_classes.py)
- Main class for combining predictions
- Methods:
  - load_images(): Load image files
  - load_annotations(): Load predictions from JSON files
  - unify_names(): Map class names to coco_class_mapping
  - filter_annotations(): Remove overlapping annotations
  - return_coco_file(): Generate COCO JSON
Usage Example
from ultralytics import YOLO
from utils.image_batch_classes import ImageBatch
# 1. Run models (or use predict_annotations function)
# ... save predictions to JSON files ...
# 2. Combine predictions
image_batch = ImageBatch(
    image_folder="path/to/images",
    catmus_labels_folder="path/to/catmus/predictions",
    emanuskript_labels_folder="path/to/emanuskript/predictions",
    zone_labels_folder="path/to/zone/predictions",
)
image_batch.load_images()
image_batch.load_annotations()
image_batch.unify_names()
# 3. Get COCO format
coco_json = image_batch.return_coco_file()
Running the Test Script
python3 test_combined_models.py
This will:
- Run all three models on bnf-naf-10039__page-001-of-004.jpg
- Combine and filter predictions
- Save results to combined_predictions.json
- Print a summary of detected classes
Output Format
The final output is a COCO-format JSON file with:
- images: Image metadata (id, width, height, filename)
- categories: List of category definitions (25 classes from coco_class_mapping)
- annotations: List of annotations with:
  - id: Annotation ID
  - image_id: Associated image ID
  - category_id: Class ID from coco_class_mapping
  - segmentation: Polygon coordinates
  - bbox: Bounding box [x, y, width, height]
  - area: Polygon area
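To make the annotation fields concrete, here is a minimal sketch that builds one annotation record from polygon vertices. The helper names are hypothetical, and computing the area with the shoelace formula is an assumption rather than a statement about how the project derives it.

```python
def polygon_area(points):
    """Shoelace formula for a simple polygon given as [(x, y), ...]."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def to_coco_annotation(ann_id, image_id, category_id, points):
    """Build one COCO-style annotation dict from polygon vertices."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, y_min = min(xs), min(ys)
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        # COCO stores each polygon as a flat [x1, y1, x2, y2, ...] list.
        "segmentation": [[c for p in points for c in p]],
        "bbox": [x_min, y_min, max(xs) - x_min, max(ys) - y_min],
        "area": polygon_area(points),
    }


# Placeholder ids; real category ids come from coco_class_mapping.
rect = [(10, 20), (110, 20), (110, 70), (10, 70)]
annotation = to_coco_annotation(1, 1, 1, rect)
```

Note that the bbox uses a top-left corner plus width and height, not two corner points.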
Class Mapping
The catmus_zones_mapping in image_batch_classes.py maps:
- Catmus/Zone model classes → coco_class_mapping classes
- Example: DefaultLine → Main script black
- Example: MainZone → Column
Only classes that map to coco_class_mapping are included in the final output.
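A rough sketch of the renaming and final class filter is shown below. It contains only the four mapping pairs quoted in this guide, and the category ids are placeholders. The `unify_and_filter` helper is hypothetical; the real dictionaries in image_batch_classes.py hold more entries.

```python
# Only the four pairs quoted in this guide; the real mapping has more entries.
catmus_zones_mapping = {
    "DefaultLine": "Main script black",
    "InterlinearLine": "Gloss",
    "MainZone": "Column",
    "DropCapitalZone": "Plain initial- coloured",
}

# Hypothetical subset of coco_class_mapping (name -> category id); ids are placeholders.
coco_class_mapping = {
    "Main script black": 1,
    "Gloss": 2,
    "Column": 3,
    "Plain initial- coloured": 4,
}


def unify_and_filter(annotations):
    """Rename classes via the mapping, then drop anything not in coco_class_mapping."""
    out = []
    for ann in annotations:
        name = catmus_zones_mapping.get(ann["name"], ann["name"])
        if name in coco_class_mapping:
            out.append({**ann, "name": name, "category_id": coco_class_mapping[name]})
    return out
```

Names that already match a target class pass through unchanged, while classes with no target entry are silently dropped.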