# Model Combination Guide

## Overview

This guide explains how to combine predictions from three YOLO models to produce a unified COCO-format output with only the classes defined in `coco_class_mapping`.
## The Three Models

### 1. **best_emanuskript_segmentation.pt**

- **Type**: Segmentation model
- **Classes**: 21 classes including:
  - Border, Table, Diagram, Music
  - Main script black/coloured
  - Variant script black/coloured
  - Plain initial (coloured/highlighted/black)
  - Historiated, Inhabited, Embellished
  - Page Number, Quire Mark, Running header, Catchword, Gloss, Illustrations

### 2. **best_catmus.pt**

- **Type**: Segmentation model
- **Classes**: 19 classes including:
  - DefaultLine, InterlinearLine
  - MainZone, MarginTextZone
  - DropCapitalZone, GraphicZone, MusicZone
  - NumberingZone, QuireMarksZone, RunningTitleZone
  - StampZone, TitlePageZone

### 3. **best_zone_detection.pt**

- **Type**: Detection model
- **Classes**: 11 zone classes:
  - MainZone, MarginTextZone
  - DropCapitalZone, GraphicZone, MusicZone
  - NumberingZone, QuireMarksZone, RunningTitleZone
  - StampZone, TitlePageZone, DigitizationArtefactZone
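## How It Works

### Step 1: Run Model Predictions

Each model is loaded from its own weights file and run independently on the input image:

```python
from ultralytics import YOLO

image_path = "path/to/image.jpg"  # placeholder input image

# Emanuskript model: every class index except 19
emanuskript_model = YOLO("best_emanuskript_segmentation.pt")
emanuskript_results = emanuskript_model.predict(image_path, classes=[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,20])
# Catmus model: only DefaultLine (1) and InterlinearLine (7)
catmus_model = YOLO("best_catmus.pt")
catmus_results = catmus_model.predict(image_path, classes=[1,7])
# Zone model: all classes
zone_model = YOLO("best_zone_detection.pt")
zone_results = zone_model.predict(image_path)
```

Each model's predictions are saved as JSON files in a separate folder per model.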
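To make the "one folder per model" layout concrete, here is a minimal sketch using only the standard library. The helper name `save_predictions`, the folder names, and the record fields (`name`, `confidence`, `box`) are assumptions for illustration; the actual `predict_annotations()` in `utils/data.py` may name and structure its files differently.

```python
import json
from pathlib import Path


def save_predictions(predictions, output_root, model_name, image_name):
    """Write one model's predictions for one image to
    <output_root>/<model_name>/<image stem>.json (hypothetical layout)."""
    folder = Path(output_root) / model_name
    folder.mkdir(parents=True, exist_ok=True)
    out_path = folder / (Path(image_name).stem + ".json")
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(predictions, f, indent=2)
    return out_path


# Assumed record layout: one dict per detected object.
example_predictions = [
    {"name": "DefaultLine", "confidence": 0.91, "box": [12.0, 40.0, 300.0, 62.0]},
]
```

Calling `save_predictions(example_predictions, "predictions", "catmus", "page-001.jpg")` would write `predictions/catmus/page-001.json`, so the three models never overwrite each other's output.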
### Step 2: Combine Predictions (ImageBatch Class)

The `ImageBatch` class (`utils/image_batch_classes.py`) handles:

1. **Loading Images**: Loads the image and gets dimensions
2. **Loading Annotations**: Loads predictions from all 3 JSON files
3. **Unifying Names**: Maps class names using `catmus_zones_mapping`:
   - `DefaultLine` → `Main script black`
   - `InterlinearLine` → `Gloss`
   - `MainZone` → `Column`
   - `DropCapitalZone` → `Plain initial- coloured`
   - etc.
4. **Filtering Annotations**:
   - Removes overlapping annotations based on spatial indexing
   - Uses overlap thresholds (0.3-0.8 depending on class)
   - Handles conflicts between different model predictions
5. **COCO Format Conversion**: Converts to COCO JSON format
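The filtering step can be illustrated with a small, simplified sketch. It is not the `ImageBatch` implementation: it assumes intersection over union (IoU) on `[x, y, width, height]` boxes as the overlap measure, compares all pairs directly instead of using a spatial index, and uses made-up threshold values inside the 0.3-0.8 range. The field names `name`, `bbox`, and `score` are also assumptions.

```python
def iou(a, b):
    """Intersection over union of two [x, y, width, height] boxes."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    inter_w = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    inter_h = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


# Hypothetical per-class thresholds; the real values live in ImageBatch.
THRESHOLDS = {"Main script black": 0.8, "Column": 0.3}
DEFAULT_THRESHOLD = 0.5


def filter_overlaps(annotations):
    """Greedy filter: visit annotations from highest to lowest score and
    drop any that overlap an already kept annotation of the same class
    by more than that class's threshold."""
    kept = []
    for ann in sorted(annotations, key=lambda a: a["score"], reverse=True):
        limit = THRESHOLDS.get(ann["name"], DEFAULT_THRESHOLD)
        clash = any(
            k["name"] == ann["name"] and iou(k["bbox"], ann["bbox"]) > limit
            for k in kept
        )
        if not clash:
            kept.append(ann)
    return kept
```

A low threshold such as 0.3 removes near-duplicates aggressively, which suits large zones like columns, while a high threshold such as 0.8 tolerates tightly packed objects like text lines.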
### Step 3: Filter to coco_class_mapping

Only annotations with classes in `coco_class_mapping` are kept (25 classes total).
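As a rough sketch of this last filter, assume `coco_class_mapping` is a dictionary from class name to category ID. The three entries and their IDs below are placeholders (the real mapping has 25 entries), and `keep_mapped` is a hypothetical helper name.

```python
# Placeholder excerpt; IDs are illustrative, not the project's real values.
coco_class_mapping = {"Main script black": 1, "Gloss": 2, "Column": 3}


def keep_mapped(annotations, class_mapping):
    """Keep only annotations whose class name is in the mapping and
    attach the corresponding category_id."""
    return [
        {**ann, "category_id": class_mapping[ann["name"]]}
        for ann in annotations
        if ann["name"] in class_mapping
    ]
```

An annotation named `StampZone`, for example, would be dropped here because it has no entry in this excerpt.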
## Key Functions

### `predict_annotations()` (in `utils/data.py`)

- Runs a single model on an image
- Saves predictions to JSON
- Used by Celery tasks for async processing

### `unify_predictions()` (in `utils/data.py`)

- Combines predictions from all three models
- Uses `ImageBatch` to process and filter
- Returns COCO format JSON
- Imports annotations into database

### `ImageBatch` class (in `utils/image_batch_classes.py`)

- Main class for combining predictions
- Methods:
  - `load_images()`: Load image files
  - `load_annotations()`: Load predictions from JSON files
  - `unify_names()`: Map class names to `coco_class_mapping`
  - `filter_annotations()`: Remove overlapping annotations
  - `return_coco_file()`: Generate COCO JSON
## Usage Example

```python
from ultralytics import YOLO
from utils.image_batch_classes import ImageBatch

# 1. Run models (or use predict_annotations function)
# ... save predictions to JSON files ...

# 2. Combine predictions
image_batch = ImageBatch(
    image_folder="path/to/images",
    catmus_labels_folder="path/to/catmus/predictions",
    emanuskript_labels_folder="path/to/emanuskript/predictions",
    zone_labels_folder="path/to/zone/predictions"
)
image_batch.load_images()
image_batch.load_annotations()
image_batch.unify_names()

# 3. Get COCO format
coco_json = image_batch.return_coco_file()
```
## Running the Test Script

```bash
python3 test_combined_models.py
```

This will:

1. Run all three models on `bnf-naf-10039__page-001-of-004.jpg`
2. Combine and filter predictions
3. Save results to `combined_predictions.json`
4. Print a summary of detected classes
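A per-class summary like the one in the last step can be reproduced from any COCO-format file. The sketch below is not the test script's own code; it relies only on the standard COCO `categories` (`id`, `name`) and `annotations` (`category_id`) fields, and the function names are hypothetical.

```python
import json
from collections import Counter


def summarize_coco(coco):
    """Count annotations per category name in a COCO-format dict."""
    id_to_name = {c["id"]: c["name"] for c in coco["categories"]}
    return Counter(id_to_name[a["category_id"]] for a in coco["annotations"])


def summarize_file(path):
    """Load a COCO JSON file and return its per-class annotation counts."""
    with open(path, encoding="utf-8") as f:
        return summarize_coco(json.load(f))
```

For instance, `summarize_file("combined_predictions.json").most_common()` lists the detected classes from most to least frequent.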
## Output Format

The final output is a COCO-format JSON file with:

- **images**: Image metadata (id, width, height, filename)
- **categories**: List of category definitions (25 classes from `coco_class_mapping`)
- **annotations**: List of annotations with:
  - `id`: Annotation ID
  - `image_id`: Associated image ID
  - `category_id`: Class ID from `coco_class_mapping`
  - `segmentation`: Polygon coordinates
  - `bbox`: Bounding box [x, y, width, height]
  - `area`: Polygon area
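To show how `bbox` and `area` relate to `segmentation`, here is a short sketch that derives both from a flat COCO polygon `[x1, y1, x2, y2, ...]`. Computing the area with the shoelace formula is an assumption about how such a value can be obtained, not a statement about what `return_coco_file()` does internally.

```python
def polygon_bbox_and_area(polygon):
    """Return ([x, y, width, height], area) for a flat [x1, y1, x2, y2, ...]
    polygon, using the shoelace formula for the area."""
    xs = polygon[0::2]
    ys = polygon[1::2]
    x_min, y_min = min(xs), min(ys)
    bbox = [x_min, y_min, max(xs) - x_min, max(ys) - y_min]
    n = len(xs)
    twice_area = sum(
        xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i] for i in range(n)
    )
    return bbox, abs(twice_area) / 2.0


# A 100 x 50 rectangle with its top-left corner at (10, 20).
rectangle = [10, 20, 110, 20, 110, 70, 10, 70]
bbox, area = polygon_bbox_and_area(rectangle)

# Illustrative annotation; the id values are placeholders.
annotation = {
    "id": 1,
    "image_id": 1,
    "category_id": 3,
    "segmentation": [rectangle],
    "bbox": bbox,
    "area": area,
}
```

For this rectangle the bounding box is `[10, 20, 100, 50]` and the area is 5000, matching width times height.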
## Class Mapping

The `catmus_zones_mapping` in `image_batch_classes.py` maps:

- Catmus/Zone model classes → `coco_class_mapping` classes
- Example: `DefaultLine` → `Main script black`
- Example: `MainZone` → `Column`

Only classes that map to `coco_class_mapping` are included in the final output.
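The renaming itself amounts to a dictionary lookup. The sketch below contains only the four pairs named in this guide; the remaining entries of the real `catmus_zones_mapping` are not reproduced. Passing unmapped names through unchanged is an assumption, and `unify_name` is a hypothetical helper.

```python
# Only the pairs listed in this guide; the real mapping has more entries.
catmus_zones_mapping = {
    "DefaultLine": "Main script black",
    "InterlinearLine": "Gloss",
    "MainZone": "Column",
    "DropCapitalZone": "Plain initial- coloured",
}


def unify_name(name, mapping=catmus_zones_mapping):
    """Translate a Catmus/Zone class name to its unified name; names
    without an entry are returned unchanged (assumed behaviour)."""
    return mapping.get(name, name)
```

With this excerpt, `unify_name("MainZone")` yields `Column`, while an Emanuskript name such as `Border` is left as it is.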