mosesb/best-comic-panel-detection

@@ -1,123 +1,125 @@
----
-license: apache-2.0
-library_name: ultralytics
-tags:
-- object-detection
-- yolo
-- yolov12
-- comic-books
-- comic
-- computer-vision
-- ultralytics
-- pytorch
-widget:
-- modelId: mosesb/best-comic-panel-detection
-  title: YOLOv12 Comic Panel Detection
-  url: https://huggingface.co/mosesb/best-comic-panel-detection/blob/main/prediction.jpg
-datasets:
-- Custom-Object-Detection
-metrics:
-- mAP50
-- mAP50-95
----
-# YOLOv12 for Comic Panel Detection
-This repository contains a **YOLOv12x** object detection model fine-tuned to detect individual panels in comic book pages. The model identifies the bounding boxes for each panel, making it a valuable tool for digitizing comics, extracting content, or building datasets for downstream analysis.
-This model was trained in PyTorch using the powerful `ultralytics` library and demonstrates high performance on a custom-annotated dataset of comic pages.
-## Model Details
-*   **Architecture:** `YOLOv12x` (the extra-large variant)
-*   **Fine-tuned on:** A custom Roboflow dataset named "Custom-Workflow-3-Object-Detection-1".
-*   **Classes:** `Comic Panel`
-*   **Frameworks:** PyTorch, Ultralytics
-## How to Get Started
-You can easily use this model with the `ultralytics` library. The model file `best.pt` from this repository is required.
-```python
-# 1. Install Ultralytics
-!pip install ultralytics
-from ultralytics import YOLO
-from PIL import Image
-# 2. Load the fine-tuned model
-# Make sure 'best.pt' is in your current directory
-model = YOLO('best.pt')
-# 3. Run inference on an image
-image_path = 'path/to/your/comic_page.jpg'
-results = model.predict(source=image_path)
-# 4. Process and visualize results
-# The 'results' object contains bounding boxes, classes, and confidence scores
-for result in results:
-    # Plotting will draw the bounding boxes on the image
-    im_array = result.plot()
-    im = Image.fromarray(im_array[..., ::-1]) # Convert BGR to RGB
-    im.show() # Display the image
-    # or
-    # im.save('prediction_result.jpg')
-# You can also access bounding box data directly
-for box in results[0].boxes:
-    print("Class:", model.names[int(box.cls)])
-    print("Confidence:", box.conf.item())
-    print("Coordinates (xyxy):", box.xyxy[0].tolist())
-    print("-" * 20)
-```
-## Training Procedure
-The model was fine-tuned using transfer learning from a YOLOv12x checkpoint pre-trained on the COCO dataset.
-### Training Hyperparameters
-*   **Image Size:** 640x640
-*   **Batch Size:** 16
-*   **Optimizer:** AdamW (lr=0.002)
-*   **Epochs:** 200
-*   **Patience:** 100 epochs for early stopping
-![Training and Validation Metrics](results.png)
-## Evaluation
-The model's performance was evaluated on the validation set during training. The final metrics are based on the checkpoint that achieved the highest **mAP50-95**.
-### Key Performance Metrics
-| Metric      | Value | Description                                          |
-| :---------- | :---- | :--------------------------------------------------- |
-| **mAP50**   | 0.991 | Mean Average Precision at IoU threshold 0.50.        |
-| **mAP50-95**| 0.985 | Mean Average Precision averaged over IoU thresholds from 0.50 to 0.95. |
-The model achieves near-perfect precision and recall on the validation data, indicating a strong ability to correctly identify comic panels within the styles present in the dataset.
-![Confusion Matrix](confusion_matrix.png)
-### Qualitative Results
-The model correctly identifies panels of various sizes and layouts in the validation set.
-![Validation Predictions](val_batch0_pred.jpg)
-## Intended Use and Limitations
-This model is intended for applications requiring the segmentation of comic book pages into their constituent panels. This can be a pre-processing step for:
--   Creating structured digital reading experiences.
--   Extracting text or characters from individual panels.
--   Analyzing comic book layouts and artistic styles.
-**The model has been tested in real world applications and has shown promising results.**
-### Limitations
-*   **Non-Rectangular Panels:** The model is trained to detect rectangular bounding boxes and may struggle with highly irregular or overlapping panel shapes.
-## Acknowledgements
-*   **Ultralytics** for the amazing [YOLOv12 model](https://github.com/ultralytics/ultralytics) and library.
-*   **Roboflow:** for their dataset hosting platform and **custom-workflow-3-object-detection-g24r5-fmfkb** for compiling and annotating this incredible dataset.
 *This model card is based on the training notebook [`YOLOV12-Comic-Panel-Detection`](https://github.com/mosesab/YOLOV12-Comic-Panel-Detection).*

+---
+license: apache-2.0
+library_name: ultralytics
+tags:
+- object-detection
+- yolo
+- yolov12
+- comic-books
+- comic
+- computer-vision
+- ultralytics
+- pytorch
+widget:
+- modelId: mosesb/best-comic-panel-detection
+  title: YOLOv12 Comic Panel Detection
+  url: https://huggingface.co/mosesb/best-comic-panel-detection/blob/main/prediction.jpg
+datasets:
+- Custom-Object-Detection
+metrics:
+- mAP50
+- mAP50-95
+---
+# YOLOv12 for Comic Panel Detection
+This repository contains a **YOLOv12x** object detection model fine-tuned to detect individual panels in comic book pages. The model identifies the bounding boxes for each panel, making it a valuable tool for digitizing comics, extracting content, or building datasets for downstream analysis.
+This model was trained in PyTorch with the `ultralytics` library and reaches 0.991 mAP50 and 0.985 mAP50-95 on the validation split of a custom-annotated dataset of comic pages.
+*Visit this space to try out the model right now: [`The_Best_Comic_Panel_Detection`](https://huggingface.co/spaces/mosesb/best-comic-panel-detection).*
+## Model Details
+*   **Architecture:** `YOLOv12x` (the extra-large variant)
+*   **Fine-tuned on:** A custom Roboflow dataset named "Custom-Workflow-3-Object-Detection-1".
+*   **Classes:** `Comic Panel`
+*   **Frameworks:** PyTorch, Ultralytics
+## How to Get Started
+You can load this model with the `ultralytics` library. Download the model file `best.pt` from this repository first.
+```python
+# 1. Install Ultralytics first. Shell: `pip install ultralytics`; notebook cell: `!pip install ultralytics`
+from ultralytics import YOLO
+from PIL import Image
+# 2. Load the fine-tuned model
+# Make sure 'best.pt' is in your current directory
+model = YOLO('best.pt')
+# 3. Run inference on an image
+image_path = 'path/to/your/comic_page.jpg'
+results = model.predict(source=image_path)
+# 4. Process and visualize results
+# The 'results' object contains bounding boxes, classes, and confidence scores
+for result in results:
+    # Plotting will draw the bounding boxes on the image
+    im_array = result.plot()
+    im = Image.fromarray(im_array[..., ::-1]) # Convert BGR to RGB
+    im.show() # Display the image
+    # or
+    # im.save('prediction_result.jpg')
+# You can also access bounding box data directly
+for box in results[0].boxes:
+    print("Class:", model.names[int(box.cls)])
+    print("Confidence:", box.conf.item())
+    print("Coordinates (xyxy):", box.xyxy[0].tolist())
+    print("-" * 20)
+```
+## Training Procedure
+The model was fine-tuned using transfer learning from a YOLOv12x checkpoint pre-trained on the COCO dataset.
+### Training Hyperparameters
+*   **Image Size:** 640x640
+*   **Batch Size:** 16
+*   **Optimizer:** AdamW (lr=0.002)
+*   **Epochs:** 200
+*   **Patience:** 100 epochs for early stopping
+![Training and Validation Metrics](results.png)
+## Evaluation
+The model's performance was evaluated on the validation set during training. The final metrics are based on the checkpoint that achieved the highest **mAP50-95**.
+### Key Performance Metrics
+| Metric      | Value | Description                                          |
+| :---------- | :---- | :--------------------------------------------------- |
+| **mAP50**   | 0.991 | Mean Average Precision at IoU threshold 0.50.        |
+| **mAP50-95**| 0.985 | Mean Average Precision averaged over IoU thresholds from 0.50 to 0.95. |
+The model achieves near-perfect precision and recall on the validation data, indicating a strong ability to correctly identify comic panels within the styles present in the dataset.
+![Confusion Matrix](confusion_matrix.png)
+### Qualitative Results
+The model correctly identifies panels of various sizes and layouts in the validation set.
+![Validation Predictions](val_batch0_pred.jpg)
+## Intended Use and Limitations
+This model is intended for applications that need to split comic book pages into their constituent panels. It can serve as a pre-processing step for:
+-   Creating structured digital reading experiences.
+-   Extracting text or characters from individual panels.
+-   Analyzing comic book layouts and artistic styles.
+**The model has been tested in real-world applications and has shown promising results.**
+### Limitations
+*   **Non-Rectangular Panels:** The model predicts rectangular bounding boxes, so it may struggle with highly irregular or overlapping panel shapes.
+## Acknowledgements
+*   **Ultralytics** for the amazing [YOLOv12 model](https://github.com/ultralytics/ultralytics) and library.
+*   **Roboflow** for their dataset hosting platform, and **custom-workflow-3-object-detection-g24r5-fmfkb** for compiling and annotating the dataset.
 *This model card is based on the training notebook [`YOLOV12-Comic-Panel-Detection`](https://github.com/mosesab/YOLOV12-Comic-Panel-Detection).*
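
The intended uses listed in the card (structured reading experiences, per-panel text or character extraction) both need the detected boxes in reading order and cropped out of the page. The sketch below is one possible post-processing step, not part of the repository: `sort_reading_order`, `crop_panels`, and the `row_tolerance` heuristic are hypothetical names and assumptions, and the ordering assumes a left-to-right, top-to-bottom layout.

```python
from typing import List, Sequence, Tuple

# One panel box as (x1, y1, x2, y2) in pixels, matching the xyxy format
# printed by the card's "How to Get Started" snippet.
Box = Tuple[float, float, float, float]


def sort_reading_order(boxes: Sequence[Box], row_tolerance: float = 0.5) -> List[Box]:
    """Order panels top-to-bottom, then left-to-right within each row.

    A panel joins the current row when its top edge lies less than
    `row_tolerance` times the height of the row's first panel below that
    panel's top edge. This grouping rule is an assumption for illustration;
    it does not handle right-to-left (manga) or heavily staggered layouts.
    """
    rows: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b[1]):  # scan by top edge
        if rows:
            first = rows[-1][0]
            if box[1] - first[1] < row_tolerance * (first[3] - first[1]):
                rows[-1].append(box)
                continue
        rows.append([box])
    # Flatten the rows, sorting each one by its left edge.
    return [b for row in rows for b in sorted(row, key=lambda b: b[0])]


def crop_panels(image, boxes: Sequence[Box]) -> list:
    """Crop every box out of a PIL-style image (any object with `.crop`)."""
    return [image.crop(tuple(int(round(v)) for v in box)) for box in boxes]


if __name__ == "__main__":
    # Hand-written example boxes; with the model they would come from
    # `results[0].boxes.xyxy.tolist()`.
    demo = [(320, 10, 620, 300), (10, 12, 300, 300), (10, 320, 620, 600)]
    for index, box in enumerate(sort_reading_order(demo), start=1):
        print(index, box)
```

With a real prediction, the boxes could be taken from `results[0].boxes.xyxy.tolist()` and the page opened with `PIL.Image.open(image_path)` before calling `crop_panels`. Rounding the coordinates to integers is a simplification; keeping a small margin around each panel may work better for downstream text extraction.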