---
license: apache-2.0
library_name: ultralytics
tags:
- object-detection
- yolo
- yolov12
- comic-books
- comic
- computer-vision
- ultralytics
- pytorch
widget:
- modelId: mosesb/best-comic-panel-detection
  title: YOLOv12 Comic Panel Detection
  url: https://huggingface.co/mosesb/best-comic-panel-detection/blob/main/prediction.jpg
datasets:
- Custom-Object-Detection
metrics:
- mAP50
- mAP50-95
---

# YOLOv12 for Comic Panel Detection

This repository contains a **YOLOv12x** object detection model fine-tuned to detect individual panels in comic book pages. The model predicts a bounding box for each panel, which makes it useful for digitizing comics, extracting content, or building datasets for downstream analysis.

The model was trained in PyTorch with the `ultralytics` library and scores highly on a custom-annotated dataset of comic pages (see the Evaluation section).

*Try the model in this Space: [`The_Best_Comic_Panel_Detection`](https://huggingface.co/spaces/mosesb/best-comic-panel-detection).*

## Model Details

* **Architecture:** `YOLOv12x` (the extra-large variant)
* **Fine-tuned on:** A custom Roboflow dataset named "Custom-Workflow-3-Object-Detection-1".
* **Classes:** `Comic Panel`
* **Frameworks:** PyTorch, Ultralytics

## How to Get Started

You can use this model with the `ultralytics` library. The model file `best.pt` from this repository is required.

```python
# 1. Install Ultralytics (run in a shell, or prefix with "!" in a notebook)
# pip install ultralytics

from ultralytics import YOLO
from PIL import Image

# 2. Load the fine-tuned model
# Make sure 'best.pt' is in your current directory
model = YOLO('best.pt')

# 3. Run inference on an image
image_path = 'path/to/your/comic_page.jpg'
results = model.predict(source=image_path)

# 4. Process and visualize results
# Each item in 'results' contains bounding boxes, classes, and confidence scores
for result in results:
    # plot() draws the bounding boxes and returns a BGR NumPy array
    im_array = result.plot()
    im = Image.fromarray(im_array[..., ::-1])  # Convert BGR to RGB
    im.show()  # Display the image
    # or
    # im.save('prediction_result.jpg')

# You can also access bounding box data directly
for box in results[0].boxes:
    print("Class:", model.names[int(box.cls)])
    print("Confidence:", box.conf.item())
    print("Coordinates (xyxy):", box.xyxy[0].tolist())
    print("-" * 20)
```
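
The per-box values printed above can also be collected into plain records for later processing. The following is a minimal sketch under stated assumptions: `detections_to_records` and its `min_conf` threshold are hypothetical names that are not part of `ultralytics`, and the input is a list of `(class_name, confidence, xyxy)` tuples such as those built from `model.names[int(box.cls)]`, `box.conf.item()`, and `box.xyxy[0].tolist()`.

```python
import json


def detections_to_records(detections, min_conf=0.25):
    """Convert (class_name, confidence, xyxy) tuples into JSON-friendly dicts.

    `min_conf` is an illustrative threshold, not a value from this model card.
    """
    records = []
    for class_name, conf, (x1, y1, x2, y2) in detections:
        if conf < min_conf:
            continue  # skip low-confidence boxes
        records.append({
            "class": class_name,
            "confidence": conf,
            "xyxy": [x1, y1, x2, y2],
            "width": x2 - x1,
            "height": y2 - y1,
        })
    return records


# Made-up detections standing in for values read from results[0].boxes
example = [
    ("Comic Panel", 0.97, (10.0, 20.0, 310.0, 420.0)),
    ("Comic Panel", 0.12, (0.0, 0.0, 5.0, 5.0)),
]
records = detections_to_records(example)
print(json.dumps(records, indent=2))
```

Keeping the records as plain dictionaries means they can be written to JSON or loaded into a dataframe without depending on `ultralytics` objects.
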
## Training Procedure

The model was fine-tuned using transfer learning from a YOLOv12x checkpoint pre-trained on the COCO dataset.

### Training Hyperparameters

* **Image Size:** 640x640
* **Batch Size:** 16
* **Optimizer:** AdamW (lr=0.002)
* **Epochs:** 200
* **Patience:** 100 epochs for early stopping

![Training and Validation Metrics](https://huggingface.co/mosesb/best-comic-panel-detection/resolve/main/results.png)
## Evaluation

The model's performance was evaluated on the validation set during training. The final metrics are based on the checkpoint that achieved the highest **mAP50-95**.

### Key Performance Metrics

| Metric | Value | Description |
| :---------- | :---- | :--------------------------------------------------- |
| **mAP50** | 0.991 | Mean Average Precision at IoU threshold 0.50. |
| **mAP50-95** | 0.985 | Mean Average Precision averaged over IoU thresholds from 0.50 to 0.95. |

Both scores are close to 1.0, indicating near-perfect precision and recall on the validation data and a strong ability to correctly identify comic panels within the styles present in the dataset.

![Confusion Matrix](https://huggingface.co/mosesb/best-comic-panel-detection/resolve/main/confusion_matrix.png)
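### Qualitative Results

The model correctly identifies panels of various sizes and layouts in the validation set.

![Validation Batch Predictions](https://huggingface.co/mosesb/best-comic-panel-detection/resolve/main/val_batch0_pred.jpg)
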
## Intended Use and Limitations

This model is intended for applications that need to split comic book pages into their constituent panels. This can be a pre-processing step for:

- Creating structured digital reading experiences.
- Extracting text or characters from individual panels.
- Analyzing comic book layouts and artistic styles.

**The model has been tested in real-world applications and has shown promising results.**
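
For the pre-processing uses listed above, detected panels usually need to be put into reading order before they are cropped. The sketch below is one simple heuristic, not part of this model or of `ultralytics`: `sort_reading_order` and `row_tolerance` are hypothetical names, and it assumes a left-to-right, top-to-bottom layout, which does not hold for every comic (manga, for example, reads right to left).

```python
def sort_reading_order(boxes, row_tolerance=0.5):
    """Order (x1, y1, x2, y2) boxes top-to-bottom, then left-to-right.

    A box joins the current row when its top edge lies within
    `row_tolerance` times the height of that row's first box.
    """
    rows = []
    for box in sorted(boxes, key=lambda b: b[1]):  # scan from the top down
        if rows:
            first = rows[-1][0]
            if box[1] - first[1] < row_tolerance * (first[3] - first[1]):
                rows[-1].append(box)
                continue
        rows.append([box])
    # Within each row, order panels by their left edge
    return [b for row in rows for b in sorted(row, key=lambda b: b[0])]


# Made-up boxes: two panels on the top row, one wide panel below
panels = [(210, 12, 400, 200), (0, 210, 400, 400), (0, 10, 200, 200)]
ordered = sort_reading_order(panels)
print(ordered)
```

Each ordered box can then be passed to Pillow's `Image.crop`, which takes a `(left, upper, right, lower)` tuple, to save the panels as separate images.
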
### Limitations

* **Non-Rectangular Panels:** The model predicts axis-aligned rectangular bounding boxes and may struggle with highly irregular or overlapping panel shapes.

## Acknowledgements

* **Ultralytics** for the amazing [YOLOv12 model](https://github.com/ultralytics/ultralytics) and library.
* **Roboflow** for their dataset hosting platform and **custom-workflow-3-object-detection-g24r5-fmfkb** for compiling and annotating this incredible dataset.

*This model card is based on the training notebook [`YOLOV12-Comic-Panel-Detection`](https://github.com/mosesab/YOLOV12-Comic-Panel-Detection).*