Image-to-Image μž‘μ—… κ°€μ΄λ“œ [[image-to-image-task-guide]]

[[open-in-colab]]

Image-to-Image μž‘μ—…μ€ μ• ν”Œλ¦¬μΌ€μ΄μ…˜μ΄ 이미지λ₯Ό μž…λ ₯λ°›μ•„ 또 λ‹€λ₯Έ 이미지λ₯Ό 좜λ ₯ν•˜λŠ” μž‘μ—…μž…λ‹ˆλ‹€. μ—¬κΈ°μ—λŠ” 이미지 ν–₯상(μ΄ˆκ³ ν•΄μƒλ„, 저쑰도 ν–₯상, 빗쀄기 제거 λ“±), 이미지 볡원 λ“± λ‹€μ–‘ν•œ ν•˜μœ„ μž‘μ—…μ΄ ν¬ν•¨λ©λ‹ˆλ‹€.

이 κ°€μ΄λ“œμ—μ„œλŠ” λ‹€μŒμ„ μˆ˜ν–‰ν•˜λŠ” 방법을 λ³΄μ—¬μ€λ‹ˆλ‹€.

  • μ΄ˆκ³ ν•΄μƒλ„ μž‘μ—…μ„ μœ„ν•œ image-to-image νŒŒμ΄ν”„λΌμΈ μ‚¬μš©,
  • νŒŒμ΄ν”„λΌμΈ 없이 λ™μΌν•œ μž‘μ—…μ„ μœ„ν•œ image-to-image λͺ¨λΈ μ‹€ν–‰

이 κ°€μ΄λ“œκ°€ λ°œν‘œλœ μ‹œμ μ—μ„œλŠ”, image-to-image νŒŒμ΄ν”„λΌμΈμ€ μ΄ˆκ³ ν•΄μƒλ„ μž‘μ—…λ§Œ μ§€μ›ν•œλ‹€λŠ” 점을 μœ μ˜ν•˜μ„Έμš”.

Let's begin by installing the necessary libraries.

pip install transformers
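The code in this guide also relies on PyTorch, Pillow, and requests. If they are not already installed in your environment, you can add them too (standard PyPI package names, listed here only as a convenience):

pip install torch pillow requests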

We can now initialize the pipeline with a Swin2SR model, and then infer with it by calling it on an image. As of now, only Swin2SR models are supported in this pipeline.

import torch
from transformers import pipeline

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
pipe = pipeline(task="image-to-image", model="caidas/swin2SR-lightweight-x2-64", device=device)
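As a side note, vision pipelines like this one can typically also be called with an image URL or a local file path instead of a PIL image. The snippet below is a minimal sketch (not part of the original guide) using the same image URL that appears in the next step:

# the pipeline also accepts a URL or local path in place of a PIL image
upscaled = pipe("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/cat.jpg")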

Now, let's load an image.

from PIL import Image
import requests

url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/cat.jpg"
image = Image.open(requests.get(url, stream=True).raw)

print(image.size)
# (532, 432)
Photo of a cat

이제 νŒŒμ΄ν”„λΌμΈμœΌλ‘œ 좔둠을 μˆ˜ν–‰ν•  수 μžˆμŠ΅λ‹ˆλ‹€. 고양이 μ΄λ―Έμ§€μ˜ μ—…μŠ€μΌ€μΌλœ 버전을 얻을 수 μžˆμŠ΅λ‹ˆλ‹€.

upscaled = pipe(image)
print(upscaled.size)
# (1072, 880)
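Since the pipeline returns a plain PIL image, you can save the result to disk right away; the filename below is just an example:

# save the upscaled image (example filename)
upscaled.save("upscaled_cat.jpg")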

νŒŒμ΄ν”„λΌμΈ 없이 직접 좔둠을 μˆ˜ν–‰ν•˜λ €λ©΄ Transformers의 Swin2SRForImageSuperResolution 및 Swin2SRImageProcessor 클래슀λ₯Ό μ‚¬μš©ν•  수 μžˆμŠ΅λ‹ˆλ‹€. 이λ₯Ό μœ„ν•΄ λ™μΌν•œ λͺ¨λΈ 체크포인트λ₯Ό μ‚¬μš©ν•©λ‹ˆλ‹€. λͺ¨λΈκ³Ό ν”„λ‘œμ„Έμ„œλ₯Ό μ΄ˆκΈ°ν™”ν•΄ λ³΄κ² μŠ΅λ‹ˆλ‹€.

from transformers import Swin2SRForImageSuperResolution, Swin2SRImageProcessor

model = Swin2SRForImageSuperResolution.from_pretrained("caidas/swin2SR-lightweight-x2-64").to(device)
processor = Swin2SRImageProcessor.from_pretrained("caidas/swin2SR-lightweight-x2-64")
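If you prefer not to hard-code the Swin2SR-specific processor class, AutoImageProcessor can load the matching processor from the same checkpoint. This is an equivalent alternative, shown only as a sketch:

from transformers import AutoImageProcessor

# resolves to the Swin2SR image processor based on the checkpoint's config
processor = AutoImageProcessor.from_pretrained("caidas/swin2SR-lightweight-x2-64")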

The pipeline abstracts away the preprocessing and postprocessing steps that we now have to do ourselves, so let's preprocess the image. We will pass the image to the processor and then move the pixel values to the GPU.

pixel_values = processor(image, return_tensors="pt").pixel_values
print(pixel_values.shape)

pixel_values = pixel_values.to(device)

We can now infer the image by passing the pixel values to the model.

import torch

with torch.no_grad():
  outputs = model(pixel_values)

The output is an object of type ImageSuperResolutionOutput and looks like below πŸ‘‡

(loss=None, reconstruction=tensor([[[[0.8270, 0.8269, 0.8275,  ..., 0.7463, 0.7446, 0.7453],
          [0.8287, 0.8278, 0.8283,  ..., 0.7451, 0.7448, 0.7457],
          [0.8280, 0.8273, 0.8269,  ..., 0.7447, 0.7446, 0.7452],
          ...,
          [0.5923, 0.5933, 0.5924,  ..., 0.0697, 0.0695, 0.0706],
          [0.5926, 0.5932, 0.5926,  ..., 0.0673, 0.0687, 0.0705],
          [0.5927, 0.5914, 0.5922,  ..., 0.0664, 0.0694, 0.0718]]]],
       device='cuda:0'), hidden_states=None, attentions=None)

We need to get the reconstruction and post-process it for visualization. Let's see how it looks.

outputs.reconstruction.data.shape
# torch.Size([1, 3, 880, 1072])

좜λ ₯ ν…μ„œμ˜ 차원을 μΆ•μ†Œν•˜κ³  0번째 좕을 μ œκ±°ν•œ λ‹€μŒ, 값을 ν΄λ¦¬ν•‘ν•˜κ³  NumPy λΆ€λ™μ†Œμˆ˜μ  λ°°μ—΄λ‘œ λ³€ν™˜ν•΄μ•Ό ν•©λ‹ˆλ‹€. 그런 λ‹€μŒ [1072, 880] λͺ¨μ–‘을 갖도둝 좕을 μž¬μ •λ ¬ν•˜κ³  λ§ˆμ§€λ§‰μœΌλ‘œ 좜λ ₯을 0κ³Ό 255 μ‚¬μ΄μ˜ 값을 갖도둝 λ˜λŒλ¦½λ‹ˆλ‹€.

import numpy as np

# squeeze the tensor, move it to the CPU, and clip the values
output = outputs.reconstruction.data.squeeze().cpu().clamp_(0, 1).numpy()
# rearrange the axes
output = np.moveaxis(output, source=0, destination=-1)
# bring the values back to pixel value range
output = (output * 255.0).round().astype(np.uint8)
Image.fromarray(output)
Upscaled photo of a cat
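To keep the result, you can assign the reconstructed PIL image to a variable and save it; its size should match the pipeline output from earlier (the filename is just an example):

upscaled_image = Image.fromarray(output)
print(upscaled_image.size)
# (1072, 880)

# save the upscaled image (example filename)
upscaled_image.save("upscaled_cat.png")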