# LandViT-DPT-330m

Land Cover Vision Transformer for Semantic Segmentation
This model performs semantic segmentation of aerial imagery, classifying land cover into 11 classes.
## Model Information

- Architecture: Vision Transformer (ViT) + DPT-style decoder
- Input resolution: 512×512×3
- Number of classes: 11 (including background)
- Framework: PyTorch + Hugging Face Transformers
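The "330m" suffix in the model name suggests a parameter count of roughly 330M; a quick sanity-check sketch (nothing here is repo-specific beyond the model ID):

```python
from transformers import AutoModelForSemanticSegmentation

model = AutoModelForSemanticSegmentation.from_pretrained(
    "JDONE-Research/LandViT-DPT-330m", trust_remote_code=True
)
# Total parameter count; the model name suggests roughly 330M
num_params = sum(p.numel() for p in model.parameters())
print(f"{num_params / 1e6:.0f}M parameters")
```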
## Performance

| Metric | Value |
|---|---|
| mIoU | (to be added) |
## Classes
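The per-class IDs and names ship with the model configuration. A minimal sketch to list them, assuming the remote config populates the standard `id2label` mapping:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained(
    "JDONE-Research/LandViT-DPT-330m", trust_remote_code=True
)
# Print each class ID alongside its label (11 entries, including background)
for class_id, name in sorted(config.id2label.items()):
    print(class_id, name)
```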
## Usage

### Installation

```bash
pip install torch torchvision transformers pillow numpy
```
### Basic Inference

```python
import torch
from PIL import Image
from transformers import AutoModelForSemanticSegmentation, AutoImageProcessor

# Load the model and image processor
model = AutoModelForSemanticSegmentation.from_pretrained(
    "JDONE-Research/LandViT-DPT-330m",
    trust_remote_code=True
)
processor = AutoImageProcessor.from_pretrained(
    "JDONE-Research/LandViT-DPT-330m",
    trust_remote_code=True
)
model.eval()

# Load an image (convert to RGB in case of grayscale or RGBA input)
image = Image.open("image.jpg").convert("RGB")

# Preprocess
inputs = processor(images=image, return_tensors="pt")

# Inference
with torch.no_grad():
    outputs = model(**inputs)

# Post-process to a per-pixel class map at the original resolution
segmentation_map = processor.post_process_semantic_segmentation(
    outputs, target_sizes=[(image.height, image.width)]
)[0]
print(f"Segmentation shape: {segmentation_map.shape}")
```
### Visualization

```python
import numpy as np
import matplotlib.pyplot as plt

# Get the per-class color palette from the model config
palette = model.config.label_colors

# Build an RGB color mask from the class map (move it to NumPy first)
seg = segmentation_map.cpu().numpy()
color_mask = np.zeros((*seg.shape, 3), dtype=np.uint8)
for class_id, color in enumerate(palette):
    color_mask[seg == class_id] = color

# Show the original image and the segmentation side by side
fig, axes = plt.subplots(1, 2, figsize=(15, 5))
axes[0].imshow(image)
axes[0].set_title("Original Image")
axes[0].axis("off")
axes[1].imshow(color_mask)
axes[1].set_title("Segmentation Result")
axes[1].axis("off")
plt.tight_layout()
plt.show()
```
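To inspect class boundaries against the underlying scene, you can also blend the mask over the source image (a simple alpha blend; the 0.5 weights are an arbitrary choice):

```python
# 50/50 alpha blend of the original image and the color mask
overlay = (0.5 * np.asarray(image, dtype=np.float32)
           + 0.5 * color_mask.astype(np.float32)).astype(np.uint8)
plt.imshow(overlay)
plt.axis("off")
plt.show()
```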
### Batch Inference

```python
# Process several images in one forward pass
images = [Image.open(f"image_{i}.jpg").convert("RGB") for i in range(4)]
inputs = processor(images=images, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Resize each class map back to its source image's resolution
segmentation_maps = processor.post_process_semantic_segmentation(
    outputs, target_sizes=[(img.height, img.width) for img in images]
)

for i, seg_map in enumerate(segmentation_maps):
    print(f"Image {i} segmentation shape: {seg_map.shape}")
```
## Limitations

- Optimized for aerial imagery; performance may degrade on images from other domains.
- Trained at a resolution of 512×512; larger images should be processed in tiles (see the tiling sketch after this list).
- Trained on imagery of regions in South Korea; fine-tuning may be needed for other regions.
- Some classes (e.g., parking lots, vinyl greenhouses) may show relatively low accuracy due to class imbalance.
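A minimal sliding-window tiling sketch for images larger than 512×512. The helper below is hypothetical (not part of this repository); it resolves overlaps by overwriting, whereas averaging logits across overlapping tiles would give smoother seams:

```python
import numpy as np
import torch

def segment_tiled(model, processor, image, tile=512, stride=448):
    # Hypothetical helper: split a large image into overlapping 512x512 tiles,
    # run the model on each tile, and stitch the class maps back together.
    # Assumes the image is at least `tile` pixels on each side.
    w, h = image.size
    full_map = np.zeros((h, w), dtype=np.int64)
    # Tile origins, clamped so the last tile ends exactly at the image border
    ys = sorted({min(y, h - tile) for y in range(0, h - tile + stride, stride)})
    xs = sorted({min(x, w - tile) for x in range(0, w - tile + stride, stride)})
    for top in ys:
        for left in xs:
            crop = image.crop((left, top, left + tile, top + tile))
            inputs = processor(images=crop, return_tensors="pt")
            with torch.no_grad():
                outputs = model(**inputs)
            seg = processor.post_process_semantic_segmentation(
                outputs, target_sizes=[(tile, tile)]
            )[0]
            # Later tiles overwrite earlier ones in the overlap region
            full_map[top:top + tile, left:left + tile] = seg.cpu().numpy()
    return full_map
```

Predictions can disagree at tile seams; a smaller `stride` (more overlap) mitigates this at extra compute cost.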
## Citation

```bibtex
@misc{landvit2026,
  title={LandViT: Land Cover Vision Transformer for Semantic Segmentation},
  author={JDONE Inc.},
  year={2026},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/JDONE-Research/LandViT-DPT-330m}}
}
```