SamirShabani/Architect

SamirShabani committed on Jul 4, 2025

Commit a62cc7c (verified) · 1 parent: 31b834c

Update README.md

README.md +123 -2

@@ -2,5 +2,126 @@
 license: cc-by-nc-4.0
 base_model:
 - Ultralytics/YOLOv8
-pipeline_tag: image-classification
----

+pipeline_tag: object-detection
+---
+# Architect (YOLOv8m)
+`Architect` is a fine-tuned YOLOv8m model for **architectural symbol spotting** in rasterized floor plans and CAD drawings. Developed as part of the `Arch-Intelli-RAG` project, it provides symbol detections that multimodal systems can use to interpret architectural drawings.
+## Model Summary
+- **Base Model:** YOLOv8m (pretrained on COCO)
+- **Task:** Object detection (28 architectural object categories)
+- **Dataset:** [FloorPlanCAD](https://floorplancad.github.io/)
+- **Performance:**
+  - **mAP50-95(B):** 0.80797
+  - **mAP50(B):** 0.87664
+---
+## ✅ Supported Classes (28)
+{
+  'single door': 0, 'double door': 1, 'sliding door': 2, 'window': 3, 'bay window': 4,
+  'blind window': 5, 'opening symbol': 6, 'stair': 7, 'gas stove': 8, 'refrigerator': 9,
+  'washing machine': 10, 'sofa': 11, 'bed': 12, 'chair': 13, 'table': 14,
+  'bedside cupboard': 15, 'TV cabinet': 16, 'half-height cabinet': 17, 'high cabinet': 18,
+  'wardrobe': 19, 'sink': 20, 'bath': 21, 'bath tub': 22, 'squat toilet': 23, 'urinal': 24,
+  'toilet': 25, 'elevator': 26, 'escalator': 27
+}
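The mapping above goes from class name to index, while detection outputs carry the index. A minimal sketch of inverting it for lookups, assuming the dictionary is copied verbatim into a variable (the names `CLASS_TO_ID`, `ID_TO_CLASS`, and `class_name` are illustrative and not part of the model's API):

```python
# Class-name -> index mapping, copied from the list above.
CLASS_TO_ID = {
    'single door': 0, 'double door': 1, 'sliding door': 2, 'window': 3, 'bay window': 4,
    'blind window': 5, 'opening symbol': 6, 'stair': 7, 'gas stove': 8, 'refrigerator': 9,
    'washing machine': 10, 'sofa': 11, 'bed': 12, 'chair': 13, 'table': 14,
    'bedside cupboard': 15, 'TV cabinet': 16, 'half-height cabinet': 17, 'high cabinet': 18,
    'wardrobe': 19, 'sink': 20, 'bath': 21, 'bath tub': 22, 'squat toilet': 23, 'urinal': 24,
    'toilet': 25, 'elevator': 26, 'escalator': 27,
}

# Inverted view: index -> class name (hypothetical helper for illustration).
ID_TO_CLASS = {idx: name for name, idx in CLASS_TO_ID.items()}


def class_name(class_id: int) -> str:
    """Return the class name for a predicted class index."""
    return ID_TO_CLASS[class_id]
```

When the model is loaded through Ultralytics, `model.names` already provides the index-to-name lookup, so a hand-written table like this is mainly useful for post-processing code that runs without the model object.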
+## 🧪 How to Use
+from ultralytics import YOLO
+from PIL import Image
+# Load the model from Hugging Face Hub
+model = YOLO('SamirShabani/Architect')
+# Run inference on a local image file
+results = model('path/to/image.png')
+# Optionally, run inference on a PIL Image
+# (keep the full results list so the loop and results[0].save() below still work)
+# image = Image.open('path/to/image.png')
+# results = model(image)
+# Print detection results
+for r in results:
+    for box in r.boxes:
+        class_id = int(box.cls[0])
+        class_name = model.names[class_id]
+        confidence = float(box.conf[0])
+        bbox = box.xyxy[0].tolist()
+        print(f"Detected: {class_name}, Confidence: {confidence:.2f}, BBox: {bbox}")
+# Save output image with drawn bounding boxes
+results[0].save(filename="prediction_output.jpg")
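The loop above prints one line per detection. For a quick inventory of a floor plan, the same values can be collected and counted per class. This is a minimal sketch in plain Python: the function name `count_by_class`, the tuple layout, and the 0.5 confidence threshold are assumptions for illustration, not settings recommended by the model author.

```python
from collections import Counter


def count_by_class(detections, min_confidence=0.5):
    """Count detections per class name, ignoring low-confidence ones.

    `detections` is an iterable of (class_name, confidence, bbox) tuples,
    i.e. the values extracted inside the printing loop.
    """
    return Counter(
        name for name, confidence, _bbox in detections
        if confidence >= min_confidence
    )


# Hand-written example values (not real model output).
example = [
    ("single door", 0.91, [10.0, 20.0, 50.0, 80.0]),
    ("single door", 0.42, [200.0, 20.0, 240.0, 80.0]),
    ("window", 0.77, [300.0, 5.0, 380.0, 15.0]),
]
counts = count_by_class(example)
```

To feed it from the loop, append `(class_name, confidence, bbox)` to a list instead of printing.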
+## 🛠️ Training Details
+- Framework: Ultralytics YOLOv8
+- Pretrained Model: yolov8m.pt
+- Training Hardware: NVIDIA Tesla P100 / T4 (Kaggle)
+- Epochs: 100 (early stopping patience=20)
+- Image Size: 640 × 640
+- Batch Size: 16
+- Optimizer: AdamW
+- Scheduler: Cosine Annealing
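The settings listed above map onto Ultralytics training arguments roughly as follows. This is a sketch, not the author's training script: the dataset file `floorplancad.yaml` is a hypothetical placeholder, `cos_lr=True` is an assumed reading of "Cosine Annealing", and any setting not listed above (learning rate, augmentation, and so on) is left at the library default.

```python
# Training arguments reconstructed from the list above.
TRAIN_ARGS = {
    "data": "floorplancad.yaml",  # hypothetical dataset config, not from the source
    "epochs": 100,
    "patience": 20,               # early stopping patience
    "imgsz": 640,
    "batch": 16,
    "optimizer": "AdamW",
    "cos_lr": True,               # assumed mapping for the cosine scheduler
}


def train(weights="yolov8m.pt", **overrides):
    """Fine-tune from the given weights; keyword overrides replace TRAIN_ARGS entries."""
    from ultralytics import YOLO  # imported lazily so the config can be inspected without it

    model = YOLO(weights)
    return model.train(**{**TRAIN_ARGS, **overrides})
```

Calling `train()` starts a full run, so it is worth trying `train(epochs=1)` first to confirm the dataset file resolves.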
+---
+## 📦 Dataset
+- Source: FloorPlanCAD (https://floorplancad.github.io/)
+- Images: 15,285 SVG drawings → converted to 640×640 PNG images
+- Labeled Samples: ~8,000 images with bounding box annotations
+- License: CC BY-NC 4.0 (https://creativecommons.org/licenses/by-nc/4.0/)
+  Non-commercial use only
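Ultralytics reads bounding box annotations as text labels of the form `class x_center y_center width height`, with coordinates normalized to the image size. The source does not describe how the boxes were derived from the SVG drawings, so the following only sketches the final conversion step, assuming pixel-space corner coordinates on a 640×640 image (the function name `to_yolo_label` is illustrative):

```python
def to_yolo_label(class_id, x_min, y_min, x_max, y_max, img_w=640, img_h=640):
    """Convert a pixel-space corner box to one YOLO label line (normalized xywh)."""
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 'window' (class 3) box spanning x 160..480 and y 320..640 on a 640x640 image.
line = to_yolo_label(3, 160, 320, 480, 640)
```

One such line per object goes into a `.txt` file that shares the image's base name.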
+---
+## 📊 Evaluation Metrics (Epoch 54)
+Metric               | Value    | Description
+---------------------|----------|-------------------------------------------
+metrics/mAP50-95(B)  | 0.80797  | Mean Average Precision [IoU = 0.50 to 0.95]
+metrics/mAP50(B)     | 0.87664  | Mean Average Precision at IoU = 0.50
+train/box_loss       | 0.4671   | Localization loss on training set
+val/box_loss         | 0.32854  | Localization loss on validation set
+train/cls_loss       | 0.81329  | Classification loss on training set
+val/cls_loss         | 0.57334  | Classification loss on validation set
+Training and validation curves are available in the results.png generated during training.
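Both mAP figures depend on Intersection over Union (IoU) between predicted and ground-truth boxes: mAP50 counts a match at IoU ≥ 0.50, while mAP50-95 averages over thresholds from 0.50 to 0.95 in steps of 0.05. A minimal sketch of the IoU computation for `xyxy` boxes follows; it is illustrative only and is not the evaluation code Ultralytics runs.

```python
def iou(box_a, box_b):
    """Intersection over Union of two [x_min, y_min, x_max, y_max] boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# The ten thresholds averaged by mAP50-95: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]
```

For example, two 10×10 boxes offset horizontally by half their width overlap with an IoU of 1/3, which would not count as a match at any of these thresholds.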
+---
+## ⚠️ Known Limitations
+- Symbol Bias: Frequent objects like doors and windows dominate the training samples.
+- Centering Bias: Objects are mostly centered in cropped training patches.
+- Text Ignorance: The model does **not** interpret text or annotations near symbols.
+- "Stuff" Categories Ignored: The model does **not** detect background elements like walls or parking spaces.
+- Low-Quality Documents: Performance may degrade on scanned or low-resolution plans with noise.
+---
+## 📚 Citation
+If you use this model or dataset, please cite the original FloorPlanCAD paper:
+@InProceedings{Fan_2021_ICCV,
+  author = {Fan, Zhiwen and Zhu, Lingjie and Li, Honghua and Zhu, Siyu and Tan, Ping},
+  title = {FloorPlanCAD: A Large-Scale CAD Drawing Dataset for Panoptic Symbol Spotting},
+  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
+  month = {October},
+  year = {2021}
+}
+## 👤 Creator
+Samir Shabani
+Machine Learning Engineer | Final Year Capstone Project
+LinkedIn: https://www.linkedin.com/in/samir-shabani
+GitHub: https://github.com/Sam1rShaban1