<!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# Image-to-Image Task Guide [[image-to-image-task-guide]]

[[open-in-colab]]

An image-to-image task is one in which an application receives an image and outputs another image. It covers a range of subtasks, including image enhancement (super resolution, low-light enhancement, deraining, and so on) and image inpainting.

This guide shows you how to:

- use an image-to-image pipeline for a super-resolution task,
- run an image-to-image model for the same task without a pipeline.

Note that, as of this guide's release, the `image-to-image` pipeline supports only the super-resolution task.

Let's begin by installing the necessary libraries.
```bash
pip install transformers
```
We can now initialize the pipeline with a [Swin2SR model](https://huggingface.co/caidas/swin2SR-lightweight-x2-64). We can then run inference by calling the pipeline with an image. Currently, this pipeline supports only Swin2SR models.
```python
import torch
from transformers import pipeline

# Use the GPU when one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
pipe = pipeline(task="image-to-image", model="caidas/swin2SR-lightweight-x2-64", device=device)
```
Now, let's load an image.
```python
from PIL import Image
import requests

url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/cat.jpg"
image = Image.open(requests.get(url, stream=True).raw)
print(image.size)
```
```bash
# (532, 432)
```
<div class="flex justify-center">
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/cat.jpg" alt="Photo of a cat"/>
</div>

We can now run inference with the pipeline to get an upscaled version of the cat image.
```python
upscaled = pipe(image)
print(upscaled.size)
```
```bash
# (1072, 880)
```
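The output is slightly larger than twice the 532×432 input (which would be 1064×864). One explanation consistent with these numbers is sketched below. It assumes, and this is not stated in the guide, that each side is padded up to the next multiple of 8 before the model runs (a full 8 pixels are added when the side is already a multiple) and that this checkpoint upscales by a factor of 2. The helper `expected_upscaled_size` and its parameters are hypothetical names used only for illustration.

```python
def expected_upscaled_size(width, height, scale=2, pad_multiple=8):
    """Estimate the (width, height) of the upscaled image.

    Assumption: each side is padded up to the next multiple of
    `pad_multiple` before upscaling, even if it already is a multiple.
    """
    padded_width = (width // pad_multiple + 1) * pad_multiple
    padded_height = (height // pad_multiple + 1) * pad_multiple
    return padded_width * scale, padded_height * scale


print(expected_upscaled_size(532, 432))  # (1072, 880)
```

Under these assumptions, 532 is padded to 536 and 432 to 440, which after 2x upscaling matches the size printed above.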
If you want to run inference yourself without a pipeline, you can use the `Swin2SRForImageSuperResolution` and `Swin2SRImageProcessor` classes from Transformers. We will use the same model checkpoint for this. Let's initialize the model and the processor.
```python
from transformers import Swin2SRForImageSuperResolution, Swin2SRImageProcessor

model = Swin2SRForImageSuperResolution.from_pretrained("caidas/swin2SR-lightweight-x2-64").to(device)
# Load the processor configuration from the same checkpoint
processor = Swin2SRImageProcessor.from_pretrained("caidas/swin2SR-lightweight-x2-64")
```
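`pipeline` abstracts away the preprocessing and postprocessing steps, so without it we have to do them ourselves. Let's start by preprocessing the image: we pass the image to the processor and then move the pixel values to the selected device (the GPU, if one is available).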
```python
pixel_values = processor(image, return_tensors="pt").pixel_values
print(pixel_values.shape)

pixel_values = pixel_values.to(device)
```
We can now run inference by passing the pixel values to the model.
```python
import torch

with torch.no_grad():
    outputs = model(pixel_values)
```
The output is an object of type `ImageSuperResolutionOutput` that looks like the following 👇
```
(loss=None, reconstruction=tensor([[[[0.8270, 0.8269, 0.8275, ..., 0.7463, 0.7446, 0.7453],
          [0.8287, 0.8278, 0.8283, ..., 0.7451, 0.7448, 0.7457],
          [0.8280, 0.8273, 0.8269, ..., 0.7447, 0.7446, 0.7452],
          ...,
          [0.5923, 0.5933, 0.5924, ..., 0.0697, 0.0695, 0.0706],
          [0.5926, 0.5932, 0.5926, ..., 0.0673, 0.0687, 0.0705],
          [0.5927, 0.5914, 0.5922, ..., 0.0664, 0.0694, 0.0718]]]],
       device='cuda:0'), hidden_states=None, attentions=None)
```
We need to take the `reconstruction` and post-process it for visualization. Let's see what it looks like.
```python
outputs.reconstruction.data.shape
# torch.Size([1, 3, 880, 1072])
```
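The reconstruction is 880×1072 rather than 864×1064, which would be exactly twice the 432×532 input, so the input appears to have been padded before reaching the model. If you want an output that is exactly twice the original size, a minimal sketch of cropping is shown here. It assumes that any padding was added on the bottom and right edges only and that the scale factor is 2; `crop_padding` is a hypothetical helper, and a zero-filled NumPy array stands in for the real tensor.

```python
import numpy as np


def crop_padding(reconstruction, original_size, scale=2):
    """Crop a (batch, channels, height, width) array to scale * original size.

    Assumption: padding was added on the bottom and right edges only.
    `original_size` is (width, height), as returned by PIL's `Image.size`.
    """
    width, height = original_size
    return reconstruction[..., : height * scale, : width * scale]


# Stand-in array with the same shape as the reconstruction in this guide
dummy = np.zeros((1, 3, 880, 1072), dtype=np.float32)
cropped = crop_padding(dummy, original_size=(532, 432))
print(cropped.shape)  # (1, 3, 864, 1064)
```

The same slicing expression also works on a PyTorch tensor, so the crop could be applied before moving the data to NumPy.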
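We need to squeeze the output to drop axis 0 (the batch dimension), clip the values to the range [0, 1], and convert the result to a NumPy float array. Then we move the channel axis to the end, so the array has shape (880, 1072, 3), which corresponds to a 1072×880 image, and finally rescale the values to the range [0, 255].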
```python
import numpy as np

# drop the batch axis, move to CPU, and clip the values to [0, 1]
output = outputs.reconstruction.data.squeeze(0).cpu().clamp_(0, 1).numpy()
# move the channel axis to the end: (3, H, W) -> (H, W, 3)
output = np.moveaxis(output, source=0, destination=-1)
# rescale the values to the pixel range [0, 255]
output = (output * 255.0).round().astype(np.uint8)
Image.fromarray(output)
```
<div class="flex justify-center">
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/cat_upscaled.png" alt="Upscaled photo of a cat"/>
</div>

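
The post-processing steps can also be checked without downloading a model. The sketch below applies the same squeeze, clip, axis move, and rescale to a small synthetic array using NumPy only; `reconstruction_to_uint8` is a hypothetical helper name, and the input values are made up for illustration.

```python
import numpy as np


def reconstruction_to_uint8(reconstruction):
    """Convert a (1, 3, H, W) float array into an (H, W, 3) uint8 array."""
    output = np.squeeze(reconstruction, axis=0)  # drop the batch axis
    output = np.clip(output, 0.0, 1.0)  # keep values inside [0, 1]
    output = np.moveaxis(output, source=0, destination=-1)  # (3, H, W) -> (H, W, 3)
    return (output * 255.0).round().astype(np.uint8)


# Synthetic "reconstruction" with a few out-of-range values
fake = np.full((1, 3, 4, 6), 0.2, dtype=np.float32)
fake[0, 0, 0, 0] = 1.7   # clipped to 1.0, becomes 255
fake[0, 1, 0, 0] = -0.3  # clipped to 0.0, becomes 0

pixels = reconstruction_to_uint8(fake)
print(pixels.shape, pixels.dtype)  # (4, 6, 3) uint8
print(pixels[0, 0].tolist())  # [255, 0, 51]
```

The resulting array has the layout that `Image.fromarray` expects for an RGB image.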