	---
	library_name: transformers
	tags:
	- vision
	- image-segmentation
	- universal-segmentation
	- korean-road
	- oneformer
	- distillation
	- aihub
	model_name: koalaseg
	---

	# KoalaSeg 🐨🛣️

	## Colab Inference
	[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1LXWqtv-7lba128iEzhgSwXtEzRpRF7I0?usp=sharing)

	_KOrean lAyered assistive Segmentation_

	![Inference Demo](./overlay_ft_20250621_130747.png)

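	A universal segmentation model specialized for Korean road and pedestrian environments.
	Ground-truth (GT) labels were generated with a layered ensemble that uses a OneFormer teacher model based on `shi-labs/oneformer_cityscapes_swin_large` and applies the following sources in order:
	1. Hand-annotated XML polygons
	2. A Korean-specific model trained on AIHUB road and pedestrian environment data: Surface Mask (5k) + Polygon (500)
	3. Cityscapes masks

	The resulting GT was then used to distill an Edge-ViT 20 M student model.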
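The layered ensemble described above can be sketched as a priority merge of label maps. This is a minimal illustration, not the project's actual pipeline: it assumes earlier layers take priority over later ones, and the `IGNORE` value, the function name `layered_merge`, and the toy class IDs are all hypothetical.

```python
import numpy as np

IGNORE = 255  # hypothetical "no label" value; the source does not name one


def layered_merge(layers, ignore=IGNORE):
    """Merge label maps in priority order: earlier layers win where they have a label."""
    merged = np.full_like(layers[0], ignore)
    for layer in layers:
        # Fill only pixels that no higher-priority layer has labelled yet.
        unfilled = (merged == ignore) & (layer != ignore)
        merged[unfilled] = layer[unfilled]
    return merged


# Toy 2x3 example: manual polygons > Korean-road model > Cityscapes mask.
manual = np.array([[7, 255, 255], [255, 255, 255]], dtype=np.uint8)
k_road = np.array([[3, 3, 255], [255, 4, 255]], dtype=np.uint8)
cityscapes = np.array([[0, 0, 0], [1, 1, 255]], dtype=np.uint8)

gt = layered_merge([manual, k_road, cityscapes])
print(gt)
```

Pixels that no layer labels keep the ignore value, so they can be excluded from the distillation loss.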

	---

	## Model Details

	- Developed by: Team RoadSight
	- Base model: `shi-labs/oneformer_cityscapes_swin_large`
	- Model type: Edge-ViT 20 M + OneFormer head (semantic task)
	- Framework: 🤗 Transformers & PyTorch

	---

	## Training Data

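	AIHUB sidewalk and pedestrian environment data (https://aihub.or.kr/aihubdata/data/view.do?dataSetSn=189):

	- Bounding Box: 350,000 images (box annotations for 29 obstacle classes)
	- Polygon: 100,000 images (polygon annotations for 29 obstacle classes) → 500 used
	- Surface Masking: 50,000 images (road surface condition masks) → 5,000 used
	- Depth Prediction: 170,000 images (stereo depth)

	A total of 18,369 images (AIHUB 5.5k + self-captured 9k + Street View 3.7k) were passed through the layered ensemble;
	the GT was generated after Morph Open/Close + MedianBlur (17 px).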
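To illustrate what the Morph Open/Close + MedianBlur clean-up does, here is a NumPy-only sketch on a single binary mask. It is an assumption-laden stand-in rather than the project's code: in practice OpenCV's `cv2.morphologyEx` and `cv2.medianBlur` would be the usual tools. The 3 px morphology kernel is hypothetical (the source gives only the 17 px median size), and the helper names are invented for this example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _windows(mask, k):
    """Return every k x k neighbourhood of a 2-D array (k odd), replicating edges."""
    pad = k // 2
    padded = np.pad(mask, pad, mode="edge")
    return sliding_window_view(padded, (k, k))


def erode(mask, k):
    return _windows(mask, k).min(axis=(-2, -1))


def dilate(mask, k):
    return _windows(mask, k).max(axis=(-2, -1))


def median_blur(mask, k):
    return np.median(_windows(mask, k), axis=(-2, -1)).astype(mask.dtype)


def clean_mask(mask, morph_k=3, median_k=17):
    """Open (erode, dilate), close (dilate, erode), then median-filter the mask."""
    opened = dilate(erode(mask, morph_k), morph_k)    # removes small specks
    closed = erode(dilate(opened, morph_k), morph_k)  # fills small holes
    return median_blur(closed, median_k)              # smooths the boundary


noisy = np.zeros((64, 64), dtype=np.uint8)
noisy[16:46, 16:46] = 1  # main region
noisy[2, 2] = 1          # isolated speck
noisy[30, 30] = 0        # pinhole inside the region
cleaned = clean_mask(noisy)
print(int(noisy.sum()), int(cleaned.sum()))
```

Opening removes the speck, closing fills the pinhole, and the large median window rounds off the region's corners.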

	---

	## Speeds & Sizes (512×512, batch=1)

	| Device | Baseline Cityscapes | Ensemble (3-layer) | Custom (K-Road) | koalaseg |
	|-----------------------|---------------------|--------------------|-----------------|--------------------|
	| A100 | 3.58 s → 0.28 FPS | 3.74 s → 0.27 FPS | 0.15 s → 6.67 FPS | 0.14 s → 7.25 FPS |
	| T4 | 5.61 s → 0.18 FPS | 6.01 s → 0.17 FPS | 0.39 s → 2.60 FPS | 0.31 s → 3.27 FPS |
	| CPU (i9-12900K) | 124 s → 0.008 FPS | 150 s → 0.007 FPS | 26.6 s → 0.038 FPS | 18.4 s → 0.054 FPS |
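FPS in the table is the reciprocal of per-image latency. Small mismatches, such as 0.14 s listed beside 7.25 FPS, are consistent with the latencies having been rounded for display. The source does not say how the timings were taken; the sketch below shows one common way to measure such numbers, with a hypothetical helper name and a stand-in workload in place of the model.

```python
import time


def measure_latency(fn, warmup=2, runs=5):
    """Return (mean seconds per call, calls per second) for a zero-argument callable."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    seconds = (time.perf_counter() - start) / runs
    return seconds, 1.0 / seconds


# Stand-in workload. For a real model, wrap the forward pass instead and, on GPU,
# call torch.cuda.synchronize() inside fn so queued kernels are included in the timing.
latency, fps = measure_latency(lambda: sum(i * i for i in range(50_000)))
print(f"{latency:.4f} s -> {fps:.2f} FPS")
```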

	---

	## Quick Start
	```python
	from io import BytesIO

	import matplotlib.pyplot as plt
	import numpy as np
	import requests
	import torch
	from PIL import Image
	from transformers import AutoProcessor, AutoModelForUniversalSegmentation

	# 0. Load model & processor -----------------------------------
	model_id = "gj5520/KoalaSeg"
	device = "cuda" if torch.cuda.is_available() else "cpu"  # fall back to CPU if no GPU
	proc = AutoProcessor.from_pretrained(model_id)
	model = AutoModelForUniversalSegmentation.from_pretrained(model_id).to(device).eval()

	# 1. Download image -------------------------------------------
	url = "https://pds.joongang.co.kr/news/component/htmlphoto_mmdata/202205/21/1200738c-61c0-4a51-83c4-331f53d4dcdc.jpg"
	resp = requests.get(url, timeout=30)
	resp.raise_for_status()  # fail early on a bad download
	img = Image.open(BytesIO(resp.content)).convert("RGB")

	# 2. Pre-process & inference ----------------------------------
	inputs = proc(images=img, task_inputs=["semantic"], return_tensors="pt").to(device)
	with torch.no_grad():
	    out = model(**inputs)

	# 3-A. Get class-id map ---------------------------------------
	# PIL reports (width, height); target_sizes expects (height, width).
	idmap = proc.post_process_semantic_segmentation(
	    out, target_sizes=[img.size[::-1]]
	)[0].cpu().numpy()

	# 3-B. Convert idmap → RGB mask + overlay ---------------------
	class_ids = np.unique(idmap)
	cmap = plt.get_cmap("tab20", max(20, len(class_ids)))
	mask_rgb = np.zeros((*idmap.shape, 3), dtype=np.uint8)
	for idx, cid in enumerate(class_ids):
	    if cid == 0:  # keep background black
	        continue
	    mask_rgb[idmap == cid] = (np.array(cmap(idx)[:3]) * 255).astype(np.uint8)

	mask_img = Image.fromarray(mask_rgb)
	overlay = Image.blend(img, mask_img, alpha=0.6)  # alpha=0.6 emphasizes the mask

	# 4. Show overlay ---------------------------------------------
	plt.figure(figsize=(8, 8))
	plt.imshow(overlay)
	plt.axis("off")
	plt.show()
	```
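Beyond the overlay, it can help to see which classes appear in the class-id map and how much of the image each covers. The sketch below is a hypothetical helper, not part of the model's API. It assumes the checkpoint's config populates `model.config.id2label` (a standard Transformers config attribute), and it uses toy stand-ins so that it runs without downloading the model.

```python
import numpy as np


def class_shares(idmap, id2label):
    """Map each class present in idmap to its fraction of the image, largest first."""
    ids, counts = np.unique(idmap, return_counts=True)
    shares = {
        id2label.get(int(cid), f"class_{int(cid)}"): float(n) / idmap.size
        for cid, n in zip(ids, counts)
    }
    return dict(sorted(shares.items(), key=lambda kv: kv[1], reverse=True))


# Toy stand-ins; in the Quick Start these would be `idmap` and `model.config.id2label`.
toy_idmap = np.array([[0, 0, 1], [0, 2, 2]])
toy_labels = {0: "road", 1: "sidewalk"}  # hypothetical label names
shares = class_shares(toy_idmap, toy_labels)
print(shares)
```

IDs missing from the label mapping fall back to a `class_<id>` placeholder instead of raising an error.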


	## Intended Uses
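	- Road segmentation for visually impaired users
	- Support for Korean HD mapping and road maintenance
	- Benchmarking on Korean datasets for academic and research purposes

	### Out-of-Scope
	- Non-road domains such as medical, satellite, or indoor imagery
	- Sensitive tasks such as personal identification or surveillance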

	---

	## Limitations & Risks
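	- Korean roads only: performance degrades overseas and in conditions such as extreme low light or heavy rain
	- Detection of partially occluded people is unstable → use only as an assistive aid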

	---

	## Citation
	@misc{KoalaSeg2025,
	title = {KoalaSeg: Layered Distillation for Korean Road Universal Segmentation},
	author = {RoadSight Team},
	year = {2025},
	url = {https://huggingface.co/gj5520/KoalaSeg}
	}