Architect (YOLOv8m)

Architect is a YOLOv8m model fine-tuned to spot architectural symbols in rasterized floor plans and CAD drawings. It was developed as part of the Architecture-RAG project to help multimodal systems interpret structured architectural content.

Model Summary

Base Model: YOLOv8m (pretrained on COCO)
Task: Object detection (28 architectural object categories)
Dataset: FloorPlanCAD
Performance:
- mAP50-95(B): 0.80797
- mAP50(B): 0.87664

✅ Supported Classes (28)

{ 'single door': 0, 'double door': 1, 'sliding door': 2, 'window': 3, 'bay window': 4, 'blind window': 5, 'opening symbol': 6, 'stair': 7, 'gas stove': 8, 'refrigerator': 9, 'washing machine': 10, 'sofa': 11, 'bed': 12, 'chair': 13, 'table': 14, 'bedside cupboard': 15, 'TV cabinet': 16, 'half-height cabinet': 17, 'high cabinet': 18, 'wardrobe': 19, 'sink': 20, 'bath': 21, 'bath tub': 22, 'squat toilet': 23, 'urinal': 24, 'toilet': 25, 'elevator': 26, 'escalator': 27 }
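
The mapping above goes from class name to class ID, while detection results report integer IDs. A minimal sketch of inverting it and selecting a group of related classes follows. Only the first six entries are repeated here for brevity, and the helper name `ids_for` is hypothetical.

```python
# Subset of the class mapping listed above (name -> ID).
CLASS_TO_ID = {
    'single door': 0,
    'double door': 1,
    'sliding door': 2,
    'window': 3,
    'bay window': 4,
    'blind window': 5,
}

# Inverse lookup (ID -> name), the direction needed when reading detections.
ID_TO_CLASS = {class_id: name for name, class_id in CLASS_TO_ID.items()}


def ids_for(keyword):
    """Return the sorted IDs of all classes whose name contains `keyword`."""
    return sorted(cid for name, cid in CLASS_TO_ID.items() if keyword in name)


door_ids = ids_for('door')      # [0, 1, 2]
window_ids = ids_for('window')  # [3, 4, 5]
```

Assuming the standard Ultralytics prediction arguments, a list such as `door_ids` can be passed as `classes=door_ids` to restrict inference to those categories.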

🧪 How to Use

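from huggingface_hub import hf_hub_download
from ultralytics import YOLO
from PIL import Image

# Download the weights from the Hugging Face Hub, then load the local file.
# YOLO() expects a path to a weights file rather than a Hub repository ID.
# NOTE: 'best.pt' is an assumed filename; check the repository's file list.
weights_path = hf_hub_download(repo_id='SamirShabani/Architect', filename='best.pt')
model = YOLO(weights_path)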

# Run inference on a local image file
results = model('path/to/image.png')

# Optionally, run inference on a PIL Image instead
# (keep the full results list so the loop and save call below work unchanged)
# image = Image.open('path/to/image.png')
# results = model(image)

# Print detection results
for r in results:
    for box in r.boxes:
        class_id = int(box.cls[0])
        class_name = model.names[class_id]
        confidence = float(box.conf[0])
        bbox = box.xyxy[0].tolist()
        print(f"Detected: {class_name}, Confidence: {confidence:.2f}, BBox: {bbox}")

# Save output image with drawn bounding boxes
results[0].save(filename="prediction_output.jpg")
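
For downstream use, such as feeding detections into a retrieval pipeline, it can help to flatten the results into plain dictionaries. The following is a sketch under stated assumptions: the function name `detections_to_records` and the `min_conf` threshold of 0.25 are illustrative and not part of this model card. It relies only on the `boxes.cls`, `boxes.conf`, and `boxes.xyxy` attributes already used above.

```python
def detections_to_records(results, names, min_conf=0.25):
    """Flatten Ultralytics results into a list of plain dicts.

    `results` is the list returned by `model(...)` and `names` is `model.names`.
    `min_conf` is an illustrative threshold, not a value from this model card.
    """
    records = []
    for image_index, r in enumerate(results):
        boxes = r.boxes
        # cls, conf and xyxy hold one row per detection.
        rows = zip(boxes.cls.tolist(), boxes.conf.tolist(), boxes.xyxy.tolist())
        for cls_id, conf, xyxy in rows:
            if conf < min_conf:
                continue  # drop low-confidence detections
            records.append({
                'image_index': image_index,
                'class_id': int(cls_id),
                'class_name': names[int(cls_id)],
                'confidence': round(conf, 4),
                'bbox_xyxy': [round(v, 1) for v in xyxy],
            })
    return records
```

With the model loaded as above, `detections_to_records(results, model.names)` returns a list that can be serialized directly with `json.dumps`.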

🛠️ Training Details

Framework: Ultralytics YOLOv8
Pretrained Model: yolov8m.pt
Training Hardware: NVIDIA Tesla P100 / T4 (Kaggle)
Epochs: 100 (early stopping patience=20)
Image Size: 640 × 640
Batch Size: 16
Optimizer: AdamW
Scheduler: Cosine Annealing
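
The settings above map onto Ultralytics `train()` keyword arguments roughly as sketched below. This is a reconstruction from the listed values, not the original training script: the dataset config path `floorplancad.yaml` is hypothetical, and any setting not listed above (learning rate, augmentation, and so on) is assumed to stay at its default.

```python
# Hyperparameters from the list above, expressed as Ultralytics train() arguments.
TRAIN_ARGS = {
    'epochs': 100,
    'patience': 20,        # early stopping patience
    'imgsz': 640,
    'batch': 16,
    'optimizer': 'AdamW',
    'cos_lr': True,        # cosine learning-rate schedule
}


def train(data_yaml='floorplancad.yaml'):
    """Fine-tune yolov8m.pt; `data_yaml` is a hypothetical dataset config path."""
    # Imported lazily so the config can be inspected without Ultralytics installed.
    from ultralytics import YOLO

    model = YOLO('yolov8m.pt')
    return model.train(data=data_yaml, **TRAIN_ARGS)
```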

📦 Dataset

Source: FloorPlanCAD (https://floorplancad.github.io/)
Images: 15,285 SVG drawings → converted to 640×640 PNG images
Labeled Samples: ~11,35 images with bounding box annotations
License: CC BY-NC 4.0 (https://creativecommons.org/licenses/by-nc/4.0/)
Non-commercial use only
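
The card does not describe how the bounding box annotations were written out, so the following is only a sketch of the standard Ultralytics label format, which a YOLOv8 training run expects: one line per object, `class x_center y_center width height`, with box values normalized by the image size. The function name is hypothetical.

```python
def xyxy_to_yolo_line(class_id, x1, y1, x2, y2, img_w=640, img_h=640):
    """Convert a pixel-space corner box to a YOLO label line.

    The four box values are normalized to [0, 1] by the image size.
    """
    x_center = (x1 + x2) / 2 / img_w
    y_center = (y1 + y2) / 2 / img_h
    width = (x2 - x1) / img_w
    height = (y2 - y1) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 'window' (class 3) box spanning (160, 320) to (480, 352) in a 640x640 image.
print(xyxy_to_yolo_line(3, 160, 320, 480, 352))
# 3 0.500000 0.525000 0.500000 0.050000
```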

📊 Evaluation Metrics (Epoch 54)

Metric	Value	Description
metrics/mAP50-95(B)	0.80797	Mean Average Precision [IoU = 0.50 to 0.95]
metrics/mAP50(B)	0.87664	Mean Average Precision at IoU = 0.50
train/box_loss	0.4671	Localization loss on training set
val/box_loss	0.32854	Localization loss on validation set
train/cls_loss	0.81329	Classification loss on training set
val/cls_loss	0.57334	Classification loss on validation set
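
Both mAP figures depend on Intersection over Union (IoU) between a predicted box and a ground-truth box. As a small illustration, not the evaluation code used for this model, the sketch below computes IoU for one pair of boxes and shows which of the ten mAP50-95 thresholds that pair would satisfy. The example boxes are made up.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# mAP50-95 averages AP over these ten IoU thresholds: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]

predicted = (100, 100, 200, 200)      # illustrative box
ground_truth = (110, 110, 210, 210)   # illustrative box
overlap = iou(predicted, ground_truth)  # 8100 / 11900, about 0.68
matched_at = [t for t in IOU_THRESHOLDS if overlap >= t]  # [0.5, 0.55, 0.6, 0.65]
```

A detection like this one counts as correct for mAP50 but fails the stricter thresholds, which is why mAP50-95 is always the lower of the two numbers.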

Training and validation curves are available in the results.png generated during training.
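
Ultralytics also writes the per-epoch numbers behind those curves to a `results.csv` file in the run directory. Assuming that file is available, a sketch like the following finds the epoch with the highest mAP50-95. The function name is hypothetical, and the column names are assumed to match the metric names in the table above.

```python
import csv


def best_epoch(csv_path, metric='metrics/mAP50-95(B)'):
    """Return (epoch, value) for the row with the highest `metric`."""
    with open(csv_path, newline='') as f:
        rows = [
            # Some Ultralytics versions pad column names and values with spaces.
            {key.strip(): value.strip() for key, value in row.items()}
            for row in csv.DictReader(f)
        ]
    best = max(rows, key=lambda row: float(row[metric]))
    return int(float(best['epoch'])), float(best[metric])
```

For a default run this might be called as `best_epoch('runs/detect/train/results.csv')`, where the path is an assumption about the output location.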

⚠️ Known Limitations

Symbol Bias: Frequent objects like doors and windows dominate the training samples.
Centering Bias: Objects are mostly centered in cropped training patches.
Text Is Ignored: The model does not read text labels or annotations near symbols.
"Stuff" Categories Ignored: The model does not detect background elements like walls or parking spaces.
Low-Quality Documents: Performance may degrade on scanned or low-resolution plans with noise.
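
Because the model was trained on 640×640 patches, one possible way to handle plans much larger than that is to run inference on overlapping tiles instead of downscaling the whole sheet. This is a suggestion, not part of the documented pipeline; the tile size matches the training image size, while the 128-pixel overlap and the function names are illustrative assumptions.

```python
def tile_origins(length, tile=640, overlap=128):
    """Start offsets of overlapping tiles that cover `length` pixels."""
    if length <= tile:
        return [0]
    stride = tile - overlap
    origins = list(range(0, length - tile, stride))
    origins.append(length - tile)  # final tile is aligned with the far edge
    return origins


def tile_boxes(width, height, tile=640, overlap=128):
    """Return (x1, y1, x2, y2) crop boxes covering a width x height image."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_origins(height, tile, overlap)
        for x in tile_origins(width, tile, overlap)
    ]


boxes = tile_boxes(1500, 640)
# [(0, 0, 640, 640), (512, 0, 1152, 640), (860, 0, 1500, 640)]
```

Each box can be passed to PIL's `Image.crop`, and the resulting detections then need to be shifted by the tile's `(x1, y1)` offset and de-duplicated where tiles overlap.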

📚 Citation

@InProceedings{Fan_2021_ICCV,
  author    = {Fan, Zhiwen and Zhu, Lingjie and Li, Honghua and Zhu, Siyu and Tan, Ping},
  title     = {FloorPlanCAD: A Large-Scale CAD Drawing Dataset for Panoptic Symbol Spotting},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  month     = {October},
  year      = {2021}
}

👤 Creator

Samir Shabani
Machine Learning Engineer | Student

LinkedIn: https://www.linkedin.com/in/samir-shabani
GitHub: https://github.com/Sam1rShaban1
