---
library_name: transformers
tags:
  - transformers
  - pipeline
  - vision
  - image-classification
  - vit
  - imagenet-1k
license: apache-2.0
datasets:
  - ILSVRC/imagenet-1k
base_model:
  - google/vit-base-patch16-224
pipeline_tag: image-classification
---

Model Card for tmp-pl-image-classification

This repository is a practice (pipeline practice) model repo for understanding and experimenting with the 🤗 Transformers pipeline() workflow.
The model weights are those of the original model google/vit-base-patch16-224, used as-is; no additional fine-tuning was performed.


Model Details

Model Description

๋ณธ ๋ชจ๋ธ์€ Vision Transformer(ViT) ๊ธฐ๋ฐ˜ ์ด๋ฏธ์ง€ ๋ถ„๋ฅ˜ ๋ชจ๋ธ์„ pipeline("image-classification") ํ˜•ํƒœ๋กœ
Hub์— ์—…๋กœ๋“œํ•˜๊ณ  ๋‹ค์‹œ ๋ถˆ๋Ÿฌ์˜ค๋Š” ์ „์ฒด ํ๋ฆ„์„ ์‹ค์Šตํ•˜๊ธฐ ์œ„ํ•ด ๊ตฌ์„ฑ๋จ.

  • Developed by: Google Research (original model)
  • Shared by [optional]: dsaint31
  • Model type: Image Classification (Vision Transformer)
  • Language(s) (NLP): Not applicable (image input)
  • License: Apache-2.0
  • Finetuned from model [optional]: google/vit-base-patch16-224 (weights unchanged; no fine-tuning performed)

Model Sources [optional]

Uses

Direct Use

  • Practicing the use of pipeline("image-classification", model=...)
  • Understanding the workflow of uploading a model to and downloading it from the Hugging Face Hub in pipeline form
  • Learning how vision models relate to pipelines

Downstream Use [optional]

  • This repo itself is not intended for fine-tuning on downstream tasks.
  • For training or performance comparison, use the original model repo directly.

Out-of-Scope Use

  • ๋ชจ๋ธ ์„ฑ๋Šฅ ํ‰๊ฐ€ ๋˜๋Š” ๋ฒค์น˜๋งˆํฌ
  • ์‹ค์ œ ์„œ๋น„์Šค ํ™˜๊ฒฝ์—์„œ์˜ ๋ชจ๋ธ ๋ฐฐํฌ
  • ํŠน์ • ๋„๋ฉ”์ธ(์˜๋ฃŒ, ์‚ฐ์—… ์˜์ƒ ๋“ฑ)์— ๋Œ€ํ•œ ์‹ ๋ขฐ์„ฑ ์žˆ๋Š” ์ถ”๋ก 

Bias, Risks, and Limitations

  • ๋ณธ ๋ชจ๋ธ์€ ImageNet ๊ธฐ๋ฐ˜ ๋ฐ์ดํ„ฐ๋กœ ํ•™์Šต๋œ ์ผ๋ฐ˜ ๋ชฉ์  ์ด๋ฏธ์ง€ ๋ถ„๋ฅ˜ ๋ชจ๋ธ์˜ ํŠน์„ฑ์„ ๊ทธ๋Œ€๋กœ ๊ฐ€์ง‘๋‹ˆ๋‹ค.
  • ํŠน์ • ๊ฐ์ฒด, ๋ฌธํ™”์  ๋งฅ๋ฝ, ์ „๋ฌธ ๋„๋ฉ”์ธ์— ๋Œ€ํ•œ ๋ถ„๋ฅ˜ ์„ฑ๋Šฅ์€ ๋ณด์žฅ๋˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค.
  • ๋ณธ repo๋Š” ์—ฐ์Šต์šฉ pipeline ์ €์žฅ์†Œ์ด๋ฏ€๋กœ ๋ชจ๋ธ์˜ ์‚ฌํšŒ์  ์˜ํ–ฅ์ด๋‚˜ ํŽธํ–ฅ ๋ถ„์„์„ ๋ชฉ์ ์œผ๋กœ ํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค.

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

  • If you have a real use case, be sure to consult the limitations in the original model card (google/vit-base-patch16-224).
  • This repo is recommended for learning and practice purposes only.

How to Get Started with the Model

Use the code below to get started with the model.

The example below is a minimal example that loads this model with a Hugging Face pipeline and classifies an image.

from transformers import pipeline
from PIL import Image
import requests

# Download a sample image.
img_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/cats.png"
image = Image.open(requests.get(img_url, stream=True).raw)

# Load this repo as an image-classification pipeline.
clf = pipeline(
    task="image-classification",
    model="dsaint31/tmp-pl-image-classification",
)

# Returns a list of {"label", "score"} dicts, sorted by score.
print(clf(image))
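The pipeline's return value is a list of dicts with "label" and "score" keys. A small, self-contained sketch of post-processing such output (the predictions below are made-up illustrative values, not real model output):

```python
# Illustrative (made-up) predictions in the format returned by
# pipeline("image-classification"): a list of {"label", "score"} dicts.
preds = [
    {"label": "tabby, tabby cat", "score": 0.72},
    {"label": "tiger cat", "score": 0.18},
    {"label": "Egyptian cat", "score": 0.05},
]

# Pick the top-1 label and format its score as a percentage.
best = max(preds, key=lambda p: p["score"])
print(f"top-1: {best['label']} ({best['score']:.0%})")
```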


Training Details

Training Data

  • No additional training was performed in this repo.
  • The original model was pretrained on ImageNet-21k and then fine-tuned on ImageNet-1k.

Training Procedure

Preprocessing [optional]

  • The original ViT model's default image preprocessing (Image Processor) is used as-is.

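As a sketch of what that preprocessing does numerically (an assumption based on the commonly documented defaults for google/vit-base-patch16-224: resize to 224x224, rescale by 1/255, then normalize each channel with mean 0.5 and std 0.5):

```python
# Assumed ViT preprocessing defaults (not read from this repo's config):
# rescale a 0-255 channel value to [0, 1], then normalize with
# mean=0.5, std=0.5, which maps the input range to [-1, 1].
MEAN, STD = 0.5, 0.5

def normalize_pixel(v: int) -> float:
    """Map a raw 0-255 channel value to the model's input range."""
    return (v / 255.0 - MEAN) / STD

# 0 maps to -1.0, 255 maps to 1.0.
print(normalize_pixel(0), normalize_pixel(255))
```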

Training Hyperparameters

  • Training regime: Not applicable (no training performed)

Speeds, Sizes, Times [optional]

Evaluation

Testing Data, Factors & Metrics

Testing Data

  • No separate evaluation was performed in this repo.
  • For performance metrics, refer to the evaluation results in the original model card.


Factors

[More Information Needed]

Metrics

[More Information Needed]

Results

[More Information Needed]

Summary

Model Examination [optional]

[More Information Needed]

Environmental Impact

๋ณธ repo์—์„œ๋Š” ํ•™์Šต์„ ์ˆ˜ํ–‰ํ•˜์ง€ ์•Š์•˜์œผ๋ฏ€๋กœ ์ถ”๊ฐ€์ ์ธ ํ™˜๊ฒฝ์  ์˜ํ–ฅ์€ ์—†์Šต๋‹ˆ๋‹ค.

  • ์›๋ณธ ๋ชจ๋ธ ํ•™์Šต์— ๋Œ€ํ•œ ํ™˜๊ฒฝ ์˜ํ–ฅ์€ base model ๋ฌธ์„œ๋ฅผ ์ฐธ๊ณ ํ•˜์‹ญ์‹œ์˜ค.

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: [More Information Needed]
  • Hours used: [More Information Needed]
  • Cloud Provider: [More Information Needed]
  • Compute Region: [More Information Needed]
  • Carbon Emitted: [More Information Needed]

Technical Specifications [optional]

Model Architecture and Objective

  • Vision Transformer (ViT-Base, patch size 16, input resolution 224x224)
  • Objective: Image classification
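As a quick sanity check, the patch geometry implied by the name vit-base-patch16-224 can be computed directly:

```python
# ViT-Base/16 at 224x224: the input image is split into 16x16 patches.
image_size, patch_size = 224, 16

patches_per_side = image_size // patch_size   # 224 / 16 = 14
num_patches = patches_per_side ** 2           # 14 * 14 = 196
seq_len = num_patches + 1                     # +1 for the [CLS] token

print(patches_per_side, num_patches, seq_len)  # 14 196 197
```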

Compute Infrastructure

[More Information Needed]

Hardware

  • ํ•ด๋‹น์—†์Œ (ํ•™์Šต ๋ฏธ์ˆ˜ํ–‰)

Software

  • Transformers
  • Pillow
  • PyTorch


Citation [optional]

์›๋ณธ ๋ชจ๋ธ ์ธ์šฉ ์‹œ ์•„๋ž˜ ๋…ผ๋ฌธ์„ ์ฐธ๊ณ ํ•˜์‹ญ์‹œ์˜ค.

BibTeX:

@article{dosovitskiy2020image,
  title={An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},
  author={Dosovitskiy, Alexey and others},
  journal={arXiv preprint arXiv:2010.11929},
  year={2020}
}

APA:

[More Information Needed]

Glossary [optional]

[More Information Needed]

More Information [optional]

[More Information Needed]

Model Card Authors [optional]

  • dsaint31 (pipeline practice repository)

Model Card Contact