
ํ‚คํฌ์ธํŠธ ํƒ์ง€ [[keypoint-detection]]

[[open-in-colab]]

ํ‚คํฌ์ธํŠธ ๊ฐ์ง€(Keypoint detection)์€ ์ด๋ฏธ์ง€ ๋‚ด์˜ ํŠน์ • ํฌ์ธํŠธ๋ฅผ ์‹๋ณ„ํ•˜๊ณ  ์œ„์น˜๋ฅผ ํƒ์ง€ํ•ฉ๋‹ˆ๋‹ค. ์ด๋Ÿฌํ•œ ํ‚คํฌ์ธํŠธ๋Š” ๋žœ๋“œ๋งˆํฌ๋ผ๊ณ ๋„ ๋ถˆ๋ฆฌ๋ฉฐ ์–ผ๊ตด ํŠน์ง•์ด๋‚˜ ๋ฌผ์ฒด์˜ ์ผ๋ถ€์™€ ๊ฐ™์€ ์˜๋ฏธ ์žˆ๋Š” ํŠน์ง•์„ ๋‚˜ํƒ€๋ƒ…๋‹ˆ๋‹ค. ํ‚คํฌ์ธํŠธ ๊ฐ์ง€ ๋ชจ๋ธ๋“ค์€ ์ด๋ฏธ์ง€๋ฅผ ์ž…๋ ฅ์œผ๋กœ ๋ฐ›์•„ ์•„๋ž˜์™€ ๊ฐ™์€ ์ถœ๋ ฅ์„ ๋ฐ˜ํ™˜ํ•ฉ๋‹ˆ๋‹ค.

  • ํ‚คํฌ์ธํŠธ๋“ค๊ณผ ์ ์ˆ˜: ๊ด€์‹ฌ ํฌ์ธํŠธ๋“ค๊ณผ ํ•ด๋‹น ํฌ์ธํŠธ์— ๋Œ€ํ•œ ์‹ ๋ขฐ๋„ ์ ์ˆ˜
  • ๋””์Šคํฌ๋ฆฝํ„ฐ(Descriptors): ๊ฐ ํ‚คํฌ์ธํŠธ๋ฅผ ๋‘˜๋Ÿฌ์‹ผ ์ด๋ฏธ์ง€ ์˜์—ญ์˜ ํ‘œํ˜„์œผ๋กœ ํ…์Šค์ฒ˜, ๊ทธ๋ผ๋ฐ์ด์…˜, ๋ฐฉํ–ฅ ๋ฐ ๊ธฐํƒ€ ์†์„ฑ์„ ์บก์ฒ˜ํ•ฉ๋‹ˆ๋‹ค.

์ด๋ฒˆ ๊ฐ€์ด๋“œ์—์„œ๋Š” ์ด๋ฏธ์ง€์—์„œ ํ‚คํฌ์ธํŠธ๋ฅผ ์ถ”์ถœํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ๋‹ค๋ฃจ์–ด ๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

์ด๋ฒˆ ํŠœํ† ๋ฆฌ์–ผ์—์„œ๋Š” ํ‚คํฌ์ธํŠธ ๊ฐ์ง€์˜ ๊ธฐ๋ณธ์ด ๋˜๋Š” ๋ชจ๋ธ์ธ SuperPoint๋ฅผ ์‚ฌ์šฉํ•ด๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

```python
from transformers import AutoImageProcessor, SuperPointForKeypointDetection

processor = AutoImageProcessor.from_pretrained("magic-leap-community/superpoint")
model = SuperPointForKeypointDetection.from_pretrained("magic-leap-community/superpoint")
```

์•„๋ž˜์˜ ์ด๋ฏธ์ง€๋กœ ๋ชจ๋ธ์„ ํ…Œ์ŠคํŠธ ํ•ด๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

```python
import torch
from PIL import Image
import requests

url_image_1 = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg"
image_1 = Image.open(requests.get(url_image_1, stream=True).raw)
url_image_2 = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/cats.png"
image_2 = Image.open(requests.get(url_image_2, stream=True).raw)

images = [image_1, image_2]
```

์ด์ œ ์ž…๋ ฅ์„ ์ฒ˜๋ฆฌํ•˜๊ณ  ์ถ”๋ก ์„ ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

```python
inputs = processor(images, return_tensors="pt").to(model.device, model.dtype)
outputs = model(**inputs)
```

๋ชจ๋ธ ์ถœ๋ ฅ์—๋Š” ๋ฐฐ์น˜ ๋‚ด์˜ ๊ฐ ํ•ญ๋ชฉ์— ๋Œ€ํ•œ ์ƒ๋Œ€์ ์ธ ํ‚คํฌ์ธํŠธ, ๋””์Šคํฌ๋ฆฝํ„ฐ, ๋งˆ์Šคํฌ์™€ ์ ์ˆ˜๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค. ๋งˆ์Šคํฌ๋Š” ์ด๋ฏธ์ง€์—์„œ ํ‚คํฌ์ธํŠธ๊ฐ€ ์žˆ๋Š” ์˜์—ญ์„ ๊ฐ•์กฐํ•˜๋Š” ์—ญํ• ์„ ํ•ฉ๋‹ˆ๋‹ค.

```
SuperPointKeypointDescriptionOutput(loss=None, keypoints=tensor([[[0.0437, 0.0167],
         [0.0688, 0.0167],
         [0.0172, 0.0188],
         ...,
         [0.5984, 0.9812],
         [0.6953, 0.9812]]]),
         scores=tensor([[0.0056, 0.0053, 0.0079,  ..., 0.0125, 0.0539, 0.0377],
        [0.0206, 0.0058, 0.0065,  ..., 0.0000, 0.0000, 0.0000]],
       grad_fn=<CopySlices>), descriptors=tensor([[[-0.0807,  0.0114, -0.1210,  ..., -0.1122,  0.0899,  0.0357],
         [-0.0807,  0.0114, -0.1210,  ..., -0.1122,  0.0899,  0.0357],
         [-0.0807,  0.0114, -0.1210,  ..., -0.1122,  0.0899,  0.0357],
         ...],
       grad_fn=<CopySlices>), mask=tensor([[1, 1, 1,  ..., 1, 1, 1],
        [1, 1, 1,  ..., 0, 0, 0]], dtype=torch.int32), hidden_states=None)
```
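Because the images in a batch can yield different numbers of keypoints, the shorter entries are padded and the `mask` marks which detections are real. A minimal sketch of filtering with the mask, using small synthetic tensors shaped like the model output (the real values come from `outputs`):

```python
import torch

# Synthetic stand-ins shaped like the SuperPoint output for a batch of 2 images:
# keypoints hold relative coordinates, mask marks real (1) vs. padded (0) detections.
keypoints = torch.tensor([[[0.04, 0.02], [0.07, 0.02], [0.60, 0.98]],
                          [[0.10, 0.01], [0.90, 0.90], [0.00, 0.00]]])
mask = torch.tensor([[1, 1, 1],
                     [1, 1, 0]], dtype=torch.int32)

# Keep only the valid keypoints of each image in the batch
valid_per_image = [kpts[m.bool()] for kpts, m in zip(keypoints, mask)]
print([v.shape for v in valid_per_image])  # [torch.Size([3, 2]), torch.Size([2, 2])]
```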

์ด๋ฏธ์ง€์— ์‹ค์ œ ํ‚คํฌ์ธํŠธ๋ฅผ ํ‘œ์‹œํ•˜๊ธฐ ์œ„ํ•ด์„  ๊ฒฐ๊ณผ๊ฐ’์„ ํ›„์ฒ˜๋ฆฌ ํ•ด์•ผํ•ฉ๋‹ˆ๋‹ค. ์ด๋ฅผ ์œ„ํ•ด ์‹ค์ œ ์ด๋ฏธ์ง€ ํฌ๊ธฐ๋ฅผ ๊ฒฐ๊ณผ๊ฐ’๊ณผ ํ•จ๊ป˜ post_process_keypoint_detection์— ์ „๋‹ฌํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.

```python
image_sizes = [(image.size[1], image.size[0]) for image in images]
outputs = processor.post_process_keypoint_detection(outputs, image_sizes)
```

The outputs are now a list of dictionaries, where each dictionary contains the post-processed keypoints, scores and descriptors.

```
[{'keypoints': tensor([[ 226,   57],
          [ 356,   57],
          [  89,   64],
          ...,
          [3604, 3391]], dtype=torch.int32),
  'scores': tensor([0.0056, 0.0053, ...], grad_fn=<IndexBackward0>),
  'descriptors': tensor([[-0.0807,  0.0114, -0.1210,  ..., -0.1122,  0.0899,  0.0357],
          [-0.0807,  0.0114, -0.1210,  ..., -0.1122,  0.0899,  0.0357]],
         grad_fn=<IndexBackward0>)},
 {'keypoints': tensor([[ 46,    6],
          [ 78,    6],
          [422,    6],
          [206,  404]], dtype=torch.int32),
  'scores': tensor([0.0206, 0.0058, 0.0065, 0.0053, 0.0070, ...], grad_fn=<IndexBackward0>),
  'descriptors': tensor([[-0.0525,  0.0726,  0.0270,  ...,  0.0389, -0.0189, -0.0211],
          [-0.0525,  0.0726,  0.0270,  ...,  0.0389, -0.0189, -0.0211]])}]
```
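Conceptually, `post_process_keypoint_detection` drops the padded entries and rescales the relative keypoints to pixel coordinates using the `(height, width)` sizes passed in. A rough sketch of the rescaling step, using a few of the relative values from the raw output above and an assumed example image size:

```python
import torch

# A few relative keypoints in [0, 1] for one image (values from the raw output above)
keypoints = torch.tensor([[0.0437, 0.0167],
                          [0.0688, 0.0167],
                          [0.6953, 0.9812]])
# Assumed example size: (height, width), i.e. (image.size[1], image.size[0])
height, width = 3456, 5184

# Scale x by width and y by height, then truncate to integer pixel coordinates
absolute = (keypoints * torch.tensor([width, height])).to(torch.int32)
print(absolute)  # rows: [226, 57], [356, 57], [3604, 3391]
```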

We can now use the dictionaries above to plot the keypoints.

```python
import matplotlib.pyplot as plt

for i in range(len(images)):
  keypoints = outputs[i]["keypoints"].detach().numpy()
  scores = outputs[i]["scores"].detach().numpy()
  image = images[i]

  plt.axis('off')
  plt.imshow(image)
  plt.scatter(
      keypoints[:, 0],
      keypoints[:, 1],
      s=scores * 100,
      c='cyan',
      alpha=0.4
  )
  plt.show()
```

์•„๋ž˜์—์„œ ๊ฒฐ๊ณผ๋ฅผ ํ™•์ธํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

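Beyond visualization, the descriptors are what make keypoints useful for downstream tasks such as image matching: corresponding points in two images have similar descriptors. A minimal sketch of mutual nearest-neighbor matching with cosine similarity, using small random descriptors in place of the 256-dimensional ones in `outputs[i]["descriptors"]`:

```python
import torch

torch.manual_seed(0)

# Random unit-norm stand-in descriptors for two images (3 and 4 keypoints, 4-d for brevity)
desc_1 = torch.nn.functional.normalize(torch.randn(3, 4), dim=1)
desc_2 = torch.nn.functional.normalize(torch.randn(4, 4), dim=1)

# Cosine similarity between every descriptor pair
similarity = desc_1 @ desc_2.T  # shape (3, 4)

# Keep a pair (i, j) only if each keypoint is the other's best match
best_12 = similarity.argmax(dim=1)  # best match in image 2 for each keypoint in image 1
best_21 = similarity.argmax(dim=0)  # best match in image 1 for each keypoint in image 2
matches = [(i, j.item()) for i, j in enumerate(best_12) if best_21[j] == i]
print(matches)
```

The mutual-nearest-neighbor check filters out one-sided matches; at least one pair always survives, since the globally most similar pair is mutual by construction.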