<!--Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.

-->
# Monocular depth estimation[[monocular-depth-estimation]]

Monocular depth estimation is a computer vision task that involves predicting the depth information of a scene from a single image.
In other words, it is the process of estimating the distance of objects in a scene from a single camera viewpoint.

Monocular depth estimation has various applications, including 3D reconstruction, augmented reality, autonomous driving, and robotics.
It is a challenging task because the model has to understand the complex relationships between objects in the scene and the corresponding depth information, which can be affected by factors such as lighting conditions, occlusion, and texture.
<Tip>

To see all architectures and checkpoints compatible with this task, we recommend checking the [task page](https://huggingface.co/tasks/depth-estimation).

</Tip>
In this guide you'll learn how to:

* create a depth estimation pipeline
* run depth estimation inference by hand

Before you begin, make sure you have all the necessary libraries installed:

```bash
pip install -q transformers
```
## Depth estimation pipeline[[depth-estimation-pipeline]]

The simplest way to try out inference with depth estimation is to use the corresponding [`pipeline`].
Instantiate a pipeline from a [checkpoint on the Hugging Face Hub](https://huggingface.co/models?pipeline_tag=depth-estimation&sort=downloads):
```py
>>> from transformers import pipeline

>>> checkpoint = "vinvino02/glpn-nyu"
>>> depth_estimator = pipeline("depth-estimation", model=checkpoint)
```
Next, choose an image to analyze:
```py
>>> from PIL import Image
>>> import requests

>>> url = "https://unsplash.com/photos/HwBAsSbPBDU/download?ixid=MnwxMjA3fDB8MXxzZWFyY2h8MzR8fGNhciUyMGluJTIwdGhlJTIwc3RyZWV0fGVufDB8MHx8fDE2Nzg5MDEwODg&force=true&w=640"
>>> image = Image.open(requests.get(url, stream=True).raw)
>>> image
```
<div class="flex justify-center">
    <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/depth-estimation-example.jpg" alt="Photo of a busy street"/>
</div>
Pass the image to the pipeline:
```py
>>> predictions = depth_estimator(image)
```
The pipeline returns a dictionary with two entries.
The first one, called `predicted_depth`, is a tensor with the values being the depth expressed in meters for each pixel.
The second one, `depth`, is a PIL image that visualizes the depth estimation result.
Let's take a look at the visualized result:
```py
>>> predictions["depth"]
```
<div class="flex justify-center">
    <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/depth-visualization.png" alt="Depth estimation visualization"/>
</div>
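You can also inspect the raw `predicted_depth` tensor directly. Here is a minimal sketch using a dummy tensor standing in for the pipeline output (the shape and values are illustrative, not produced by the actual model):

```python
import torch

# dummy stand-in for predictions["predicted_depth"]:
# a (batch, height, width) float tensor of per-pixel depth values
predicted_depth = torch.rand(1, 480, 640)

# depth predicted for the pixel at row 100, column 200
center_depth = predicted_depth[0, 100, 200].item()

# coarse statistics of the whole depth map
print(predicted_depth.shape)
print(predicted_depth.min().item(), predicted_depth.max().item())
```

Indexing the tensor this way is useful when you need the numeric depth at a specific pixel rather than the visualization.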
## Depth estimation inference by hand[[depth-estimation-inference-by-hand]]

Now that you've seen how to use the depth estimation pipeline, let's see how to replicate the same result by hand.

Start by loading the model and associated processor from a [checkpoint on the Hugging Face Hub](https://huggingface.co/models?pipeline_tag=depth-estimation&sort=downloads).
Here we'll use the same checkpoint as before:
```py
>>> from transformers import AutoImageProcessor, AutoModelForDepthEstimation

>>> checkpoint = "vinvino02/glpn-nyu"
>>> image_processor = AutoImageProcessor.from_pretrained(checkpoint)
>>> model = AutoModelForDepthEstimation.from_pretrained(checkpoint)
```
Prepare the image input for the model using the `image_processor`, which takes care of the necessary image transformations such as resizing and normalization:
```py
>>> pixel_values = image_processor(image, return_tensors="pt").pixel_values
```
Pass the prepared inputs through the model:
```py
>>> import torch

>>> with torch.no_grad():
...     outputs = model(pixel_values)
...     predicted_depth = outputs.predicted_depth
```
Visualize the results:
```py
>>> import numpy as np

>>> # interpolate to the original image size
>>> prediction = torch.nn.functional.interpolate(
...     predicted_depth.unsqueeze(1),
...     size=image.size[::-1],
...     mode="bicubic",
...     align_corners=False,
... ).squeeze()
>>> output = prediction.numpy()

>>> formatted = (output * 255 / np.max(output)).astype("uint8")
>>> depth = Image.fromarray(formatted)
>>> depth
```
<div class="flex justify-center">
    <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/depth-visualization.png" alt="Depth estimation visualization"/>
</div>
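One detail worth noting in the interpolation step above: PIL's `image.size` is ordered `(width, height)`, while `torch.nn.functional.interpolate` expects a `(height, width)` target, which is why the code reverses it with `[::-1]`. A minimal sketch with a dummy tensor (the shapes are illustrative):

```python
import torch

# dummy (batch, height, width) depth map, standing in for the model output
predicted_depth = torch.rand(1, 240, 320)

# suppose the original image is 640 wide and 480 tall: this is PIL's image.size
pil_size = (640, 480)

upsampled = torch.nn.functional.interpolate(
    predicted_depth.unsqueeze(1),  # add a channel dim: (batch, 1, H, W)
    size=pil_size[::-1],           # reverse to the (height, width) order interpolate expects
    mode="bicubic",
    align_corners=False,
).squeeze()

print(upsampled.shape)  # torch.Size([480, 640])
```

Forgetting the reversal would silently produce a depth map with width and height swapped, so it is worth double-checking the output shape against the original image.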