SamirShabani committed
Commit a62cc7c · verified · 1 Parent(s): 31b834c

Update README.md

Files changed (1): README.md (+123 −2)
README.md CHANGED
---
license: cc-by-nc-4.0
base_model:
- Ultralytics/YOLOv8
pipeline_tag: object-detection
---

# Architect (YOLOv8m)

`Architect` is a YOLOv8m model fine-tuned for **architectural symbol spotting** in rasterized floor plans and CAD drawings. Developed as part of the `Arch-Intelli-RAG` project, it lets multimodal systems detect and localize structured architectural content.

## Model Summary

- **Base Model:** YOLOv8m (pretrained on COCO)
- **Task:** Object detection (28 architectural object categories)
- **Dataset:** [FloorPlanCAD](https://floorplancad.github.io/)
- **Performance:**
  - **mAP50-95(B):** 0.80797
  - **mAP50(B):** 0.87664

---

## ✅ Supported Classes (28)

```python
{
    'single door': 0, 'double door': 1, 'sliding door': 2, 'window': 3, 'bay window': 4,
    'blind window': 5, 'opening symbol': 6, 'stair': 7, 'gas stove': 8, 'refrigerator': 9,
    'washing machine': 10, 'sofa': 11, 'bed': 12, 'chair': 13, 'table': 14,
    'bedside cupboard': 15, 'TV cabinet': 16, 'half-height cabinet': 17, 'high cabinet': 18,
    'wardrobe': 19, 'sink': 20, 'bath': 21, 'bath tub': 22, 'squat toilet': 23, 'urinal': 24,
    'toilet': 25, 'elevator': 26, 'escalator': 27
}
```
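
The mapping above goes from name to integer ID, while predictions come back as integer IDs; a quick inversion gives the ID-to-name lookup (a minimal sketch using only the class map shown above):

```python
# Class map from the section above (name -> ID).
CLASS_IDS = {
    'single door': 0, 'double door': 1, 'sliding door': 2, 'window': 3, 'bay window': 4,
    'blind window': 5, 'opening symbol': 6, 'stair': 7, 'gas stove': 8, 'refrigerator': 9,
    'washing machine': 10, 'sofa': 11, 'bed': 12, 'chair': 13, 'table': 14,
    'bedside cupboard': 15, 'TV cabinet': 16, 'half-height cabinet': 17, 'high cabinet': 18,
    'wardrobe': 19, 'sink': 20, 'bath': 21, 'bath tub': 22, 'squat toilet': 23, 'urinal': 24,
    'toilet': 25, 'elevator': 26, 'escalator': 27
}

# Invert to ID -> name for decoding integer class predictions.
ID_TO_NAME = {v: k for k, v in CLASS_IDS.items()}

print(ID_TO_NAME[3])    # window
print(len(ID_TO_NAME))  # 28
```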

## 🧪 How to Use

```python
from ultralytics import YOLO
from PIL import Image

# Load the model from the Hugging Face Hub
model = YOLO('SamirShabani/Architect')

# Run inference on a local image file
results = model('path/to/image.png')

# Optionally, run inference on a PIL Image
# image = Image.open('path/to/image.png')
# results = model(image)

# Print detection results
for r in results:
    for box in r.boxes:
        class_id = int(box.cls[0])
        class_name = model.names[class_id]
        confidence = float(box.conf[0])
        bbox = box.xyxy[0].tolist()
        print(f"Detected: {class_name}, Confidence: {confidence:.2f}, BBox: {bbox}")

# Save output image with drawn bounding boxes
results[0].save(filename="prediction_output.jpg")
```
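
Once detections have been extracted into plain tuples as in the loop above, simple post-processing needs no Ultralytics dependency. Below is a small sketch that filters by a confidence threshold and counts detections per class; the helper name, the 0.5 threshold, and the example detections are all illustrative, not part of this repository:

```python
# Post-process detections given as (class_name, confidence, bbox) tuples,
# matching what the inference loop above prints. Threshold is an example value.
def filter_and_count(detections, min_conf=0.5):
    """Keep detections at or above min_conf and count occurrences per class."""
    kept = [d for d in detections if d[1] >= min_conf]
    counts = {}
    for name, _conf, _bbox in kept:
        counts[name] = counts.get(name, 0) + 1
    return kept, counts

# Made-up example detections:
dets = [
    ('single door', 0.91, [10, 10, 50, 80]),
    ('window', 0.42, [100, 20, 140, 60]),
    ('window', 0.77, [200, 20, 240, 60]),
]
kept, counts = filter_and_count(dets)
print(counts)  # {'single door': 1, 'window': 1}
```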

## 🛠️ Training Details

- Framework: Ultralytics YOLOv8
- Pretrained Model: yolov8m.pt
- Training Hardware: NVIDIA Tesla P100 / T4 (Kaggle)
- Epochs: 100 (early stopping, patience = 20)
- Image Size: 640 × 640
- Batch Size: 16
- Optimizer: AdamW
- Scheduler: Cosine Annealing
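
Collected as Ultralytics `model.train()` keyword arguments, the hyperparameters above would look roughly like this. This is a sketch, not the actual training script: the dataset YAML path is a placeholder, and the `train` call itself is left commented out because it needs the dataset and GPU environment:

```python
# Hyperparameters from the list above, in the keyword form accepted by
# Ultralytics' model.train(). The data YAML path below is a placeholder.
train_args = dict(
    epochs=100,        # maximum epochs
    patience=20,       # early-stopping patience
    imgsz=640,         # 640 x 640 input resolution
    batch=16,
    optimizer="AdamW",
    cos_lr=True,       # cosine-annealing learning-rate schedule
)

# Not executed here -- requires the ultralytics package and the dataset:
# from ultralytics import YOLO
# model = YOLO("yolov8m.pt")
# model.train(data="floorplancad.yaml", **train_args)

print(train_args["optimizer"])  # AdamW
```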

---

## 📦 Dataset

- Source: [FloorPlanCAD](https://floorplancad.github.io/)
- Images: 15,285 SVG drawings converted to 640 × 640 PNG images
- Labeled Samples: ~8,000 images with bounding-box annotations
- License: [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/) (non-commercial use only)
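
Bounding-box annotations in the YOLO label format store a class ID plus normalized center coordinates and sizes; converting one label line back to pixel coordinates on a 640 × 640 image works as sketched below (the helper name and the example label line are made up for illustration):

```python
def yolo_to_xyxy(label_line, img_w=640, img_h=640):
    """Convert one YOLO label line 'cls cx cy w h' (normalized, center-based)
    to (class_id, x1, y1, x2, y2) in pixel coordinates."""
    cls, cx, cy, w, h = label_line.split()
    cx, cy = float(cx) * img_w, float(cy) * img_h
    w, h = float(w) * img_w, float(h) * img_h
    return int(cls), cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2

# Made-up label: class 3 ('window'), centered, a quarter of the image wide.
print(yolo_to_xyxy("3 0.5 0.5 0.25 0.125"))
# (3, 240.0, 280.0, 400.0, 360.0)
```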

---

## 📊 Evaluation Metrics (Epoch 54)

| Metric              | Value   | Description                                  |
|---------------------|---------|----------------------------------------------|
| metrics/mAP50-95(B) | 0.80797 | Mean Average Precision over IoU = 0.50–0.95  |
| metrics/mAP50(B)    | 0.87664 | Mean Average Precision at IoU = 0.50         |
| train/box_loss      | 0.4671  | Localization loss on the training set        |
| val/box_loss        | 0.32854 | Localization loss on the validation set      |
| train/cls_loss      | 0.81329 | Classification loss on the training set      |
| val/cls_loss        | 0.57334 | Classification loss on the validation set    |

Training and validation curves are available in the `results.png` generated during training.
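
The mAP figures above are averaged over IoU thresholds, where IoU measures overlap between a predicted and a ground-truth box. For reference, here is the standard IoU computation for two `(x1, y1, x2, y2)` boxes (a generic formula, not code from this repository):

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

# A box covering the bottom half of another overlaps it with IoU = 0.5:
print(iou((0, 0, 10, 10), (0, 0, 10, 5)))  # 0.5
```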

---

## ⚠️ Known Limitations

- Symbol Bias: Frequent objects such as doors and windows dominate the training samples.
- Centering Bias: Objects are mostly centered in the cropped training patches.
- No Text Understanding: The model does **not** interpret text or annotations near symbols.
- "Stuff" Categories Ignored: The model does **not** detect background elements such as walls or parking spaces.
- Low-Quality Documents: Performance may degrade on noisy scanned or low-resolution plans.

---

## 📚 Citation

If you use this model or dataset, please cite the original FloorPlanCAD paper:

```bibtex
@InProceedings{Fan_2021_ICCV,
    author    = {Fan, Zhiwen and Zhu, Lingjie and Li, Honghua and Zhu, Siyu and Tan, Ping},
    title     = {FloorPlanCAD: A Large-Scale CAD Drawing Dataset for Panoptic Symbol Spotting},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021}
}
```

## 👤 Creator

**Samir Shabani**
Machine Learning Engineer | Final-Year Capstone Project

- LinkedIn: https://www.linkedin.com/in/samir-shabani
- GitHub: https://github.com/Sam1rShaban1