# Model Combination Guide

## Overview

This guide explains how to combine predictions from three YOLO models to produce a unified COCO-format output with only the classes defined in `coco_class_mapping`.

## The Three Models

### 1. **best_emanuskript_segmentation.pt**
- **Type**: Segmentation model
- **Classes**: 21 classes including:
  - Border, Table, Diagram, Music
  - Main script black/coloured
  - Variant script black/coloured
  - Plain initial (coloured/highlighted/black)
  - Historiated, Inhabited, Embellished
  - Page Number, Quire Mark, Running header, Catchword, Gloss, Illustrations

### 2. **best_catmus.pt**
- **Type**: Segmentation model
- **Classes**: 19 classes including:
  - DefaultLine, InterlinearLine
  - MainZone, MarginTextZone
  - DropCapitalZone, GraphicZone, MusicZone
  - NumberingZone, QuireMarksZone, RunningTitleZone
  - StampZone, TitlePageZone

### 3. **best_zone_detection.pt**
- **Type**: Detection model
- **Classes**: 11 zone classes:
  - MainZone, MarginTextZone
  - DropCapitalZone, GraphicZone, MusicZone
  - NumberingZone, QuireMarksZone, RunningTitleZone
  - StampZone, TitlePageZone, DigitizationArtefactZone

## How It Works

### Step 1: Run Model Predictions
Each model is run independently on the input image:
```python
from ultralytics import YOLO

# Emanuskript model: all classes except 19
emanuskript_model = YOLO("best_emanuskript_segmentation.pt")
emanuskript_results = emanuskript_model.predict(
    image_path,
    classes=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 20],
)

# Catmus model: only DefaultLine and InterlinearLine
catmus_model = YOLO("best_catmus.pt")
catmus_results = catmus_model.predict(image_path, classes=[1, 7])

# Zone model: all classes
zone_model = YOLO("best_zone_detection.pt")
zone_results = zone_model.predict(image_path)
```

Predictions are saved to JSON files in separate folders.
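The exact JSON layout is not specified in this guide; as a minimal sketch (assuming one JSON file per image, each prediction carrying a class name, bounding box, and polygon), saving could look like:

```python
import json
import tempfile
from pathlib import Path

def save_predictions(output_folder, image_name, predictions):
    """Write one JSON file of predictions per image (hypothetical layout)."""
    folder = Path(output_folder)
    folder.mkdir(parents=True, exist_ok=True)
    out_path = folder / (Path(image_name).stem + ".json")
    out_path.write_text(json.dumps(predictions, indent=2))
    return out_path

# One folder per model, one JSON file per image
catmus_folder = Path(tempfile.mkdtemp()) / "catmus"
path = save_predictions(
    catmus_folder,
    "bnf-naf-10039__page-001-of-004.jpg",
    [{"class": "DefaultLine", "bbox": [10, 20, 300, 40], "segmentation": []}],
)
```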

### Step 2: Combine Predictions (ImageBatch Class)

The `ImageBatch` class (`utils/image_batch_classes.py`) handles:

1. **Loading Images**: Loads the image and gets dimensions
2. **Loading Annotations**: Loads predictions from all 3 JSON files
3. **Unifying Names**: Maps class names using `catmus_zones_mapping`:
   - `DefaultLine` β†’ `Main script black`
   - `InterlinearLine` β†’ `Gloss`
   - `MainZone` β†’ `Column`
   - `DropCapitalZone` β†’ `Plain initial- coloured`
   - etc.

4. **Filtering Annotations**:
   - Removes overlapping annotations using a spatial index
   - Applies per-class overlap thresholds (0.3 to 0.8, depending on the class)
   - Resolves conflicts between predictions from different models

5. **COCO Format Conversion**: Converts to COCO JSON format
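The overlap filtering in step 4 can be sketched as follows. This is a simplified stand-in, not the actual `ImageBatch.filter_annotations` implementation: the real thresholds and spatial index live in `utils/image_batch_classes.py`, and the bbox format here is assumed to be `[x, y, width, height]`.

```python
def bbox_overlap_ratio(a, b):
    """Intersection area divided by the smaller box's area; boxes are [x, y, w, h]."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    smaller = min(aw * ah, bw * bh)
    return (ix * iy) / smaller if smaller else 0.0

def filter_overlaps(annotations, thresholds, default=0.5):
    """Keep an annotation only if its overlap with every kept one stays below its class threshold."""
    kept = []
    for ann in annotations:
        limit = thresholds.get(ann["class"], default)
        if all(bbox_overlap_ratio(ann["bbox"], k["bbox"]) < limit for k in kept):
            kept.append(ann)
    return kept

# Hypothetical per-class thresholds in the 0.3-0.8 range
thresholds = {"Gloss": 0.3, "Column": 0.8}
anns = [
    {"class": "Gloss", "bbox": [0, 0, 100, 100]},
    {"class": "Gloss", "bbox": [10, 10, 100, 100]},  # heavy overlap -> dropped
    {"class": "Gloss", "bbox": [500, 500, 50, 50]},  # no overlap -> kept
]
filtered = filter_overlaps(anns, thresholds)
```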

### Step 3: Filter to coco_class_mapping

Only annotations with classes in `coco_class_mapping` are kept (25 classes total).
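In sketch form (the real `coco_class_mapping` has 25 entries; only two hypothetical ones are shown here):

```python
# Hypothetical subset of coco_class_mapping: class name -> COCO category id
coco_class_mapping = {"Main script black": 1, "Gloss": 2}

annotations = [
    {"class": "Main script black", "bbox": [0, 0, 10, 10]},
    {"class": "DigitizationArtefactZone", "bbox": [5, 5, 10, 10]},  # not mapped -> dropped
]

# Keep only annotations whose class appears in the mapping
kept = [a for a in annotations if a["class"] in coco_class_mapping]
```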

## Key Functions

### `predict_annotations()` (in `utils/data.py`)
- Runs a single model on an image
- Saves predictions to JSON
- Used by Celery tasks for async processing

### `unify_predictions()` (in `utils/data.py`)
- Combines predictions from all three models
- Uses `ImageBatch` to process and filter
- Returns COCO format JSON
- Imports annotations into database

### `ImageBatch` class (in `utils/image_batch_classes.py`)
- Main class for combining predictions
- Methods:
  - `load_images()`: Load image files
  - `load_annotations()`: Load predictions from JSON files
  - `unify_names()`: Map class names to coco_class_mapping
  - `filter_annotations()`: Remove overlapping annotations
  - `return_coco_file()`: Generate COCO JSON

## Usage Example

```python
from ultralytics import YOLO
from utils.image_batch_classes import ImageBatch

# 1. Run models (or use predict_annotations function)
# ... save predictions to JSON files ...

# 2. Combine predictions
image_batch = ImageBatch(
    image_folder="path/to/images",
    catmus_labels_folder="path/to/catmus/predictions",
    emanuskript_labels_folder="path/to/emanuskript/predictions",
    zone_labels_folder="path/to/zone/predictions"
)

image_batch.load_images()
image_batch.load_annotations()
image_batch.unify_names()

# 3. Get COCO format
coco_json = image_batch.return_coco_file()
```

## Running the Test Script

```bash
python3 test_combined_models.py
```

This will:
1. Run all three models on `bnf-naf-10039__page-001-of-004.jpg`
2. Combine and filter predictions
3. Save results to `combined_predictions.json`
4. Print a summary of detected classes

## Output Format

The final output is a COCO-format JSON file with:
- **images**: Image metadata (id, width, height, filename)
- **categories**: List of category definitions (25 classes from coco_class_mapping)
- **annotations**: List of annotations with:
  - `id`: Annotation ID
  - `image_id`: Associated image ID
  - `category_id`: Class ID from coco_class_mapping
  - `segmentation`: Polygon coordinates
  - `bbox`: Bounding box [x, y, width, height]
  - `area`: Polygon area
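A minimal instance of this structure, with illustrative values only (the image dimensions, ids, and category names below are placeholders, not real output):

```python
import json

coco = {
    "images": [
        {"id": 1, "width": 2480, "height": 3508,
         "file_name": "bnf-naf-10039__page-001-of-004.jpg"}
    ],
    "categories": [
        {"id": 1, "name": "Main script black"},
        {"id": 2, "name": "Gloss"},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            "segmentation": [[100, 200, 400, 200, 400, 260, 100, 260]],
            "bbox": [100, 200, 300, 60],   # [x, y, width, height]
            "area": 18000,                 # 300 * 60 for this rectangle
        }
    ],
}

coco_json = json.dumps(coco, indent=2)
```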

## Class Mapping

The `catmus_zones_mapping` in `image_batch_classes.py` maps:
- Catmus/Zone model classes β†’ coco_class_mapping classes
- Example: `DefaultLine` β†’ `Main script black`
- Example: `MainZone` β†’ `Column`

Only classes that map to `coco_class_mapping` are included in the final output.
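Based only on the pairs mentioned in this guide, the mapping and renaming step might look like this sketch (the full dict lives in `image_batch_classes.py`; dropping unmapped classes here is an assumption consistent with the filtering described above):

```python
# Subset of catmus_zones_mapping, reconstructed from the examples in this guide
catmus_zones_mapping = {
    "DefaultLine": "Main script black",
    "InterlinearLine": "Gloss",
    "MainZone": "Column",
    "DropCapitalZone": "Plain initial- coloured",
}

def unify_names(annotations, mapping):
    """Rename classes via the mapping; drop anything the mapping does not cover."""
    unified = []
    for ann in annotations:
        name = mapping.get(ann["class"])
        if name is not None:
            unified.append({**ann, "class": name})
    return unified

anns = [{"class": "DefaultLine"}, {"class": "StampZone"}]
result = unify_names(anns, catmus_zones_mapping)
```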