---
library_name: transformers
tags:
- transformers
- pipeline
- vision
- image-classification
- vit
- imagenet-1k
license: apache-2.0
datasets:
- ILSVRC/imagenet-1k
base_model:
- google/vit-base-patch16-224
pipeline_tag: image-classification
---
# Model Card for tmp-pl-image-classification
์ด ์ €์žฅ์†Œ๋Š” ๐Ÿค— Transformers์˜ `pipeline()` ๋™์ž‘์„ ์ดํ•ดํ•˜๊ณ  ์—ฐ์Šตํ•˜๊ธฐ ์œ„ํ•œ **ํ•™์Šต์šฉ(pipeline practice) ๋ชจ๋ธ repo** ์ž…๋‹ˆ๋‹ค.
๋ชจ๋ธ ๊ฐ€์ค‘์น˜๋Š” ์›๋ณธ ๋ชจ๋ธ **`google/vit-base-patch16-224`** ์„ ๊ทธ๋Œ€๋กœ ์‚ฌ์šฉํ•˜๋ฉฐ, ์ถ”๊ฐ€์ ์ธ fine-tuning์€ ์ˆ˜ํ–‰ํ•˜์ง€ ์•Š์•˜์Šต๋‹ˆ๋‹ค.
---
## Model Details
### Model Description
<!-- Provide a longer summary of what this model is. -->
๋ณธ ๋ชจ๋ธ์€ **Vision Transformer(ViT)** ๊ธฐ๋ฐ˜ ์ด๋ฏธ์ง€ ๋ถ„๋ฅ˜ ๋ชจ๋ธ์„ `pipeline("image-classification")` ํ˜•ํƒœ๋กœ
Hub์— ์—…๋กœ๋“œํ•˜๊ณ  ๋‹ค์‹œ ๋ถˆ๋Ÿฌ์˜ค๋Š” ์ „์ฒด ํ๋ฆ„์„ ์‹ค์Šตํ•˜๊ธฐ ์œ„ํ•ด ๊ตฌ์„ฑ๋จ.
- **Developed by:** Google Research (original model)
- **Shared by [optional]:** dsaint31
- **Model type:** Image Classification (Vision Transformer)
- **Language(s) (NLP):** Not applicable (image input)
- **License:** Apache-2.0
- **Finetuned from model [optional]:** google/vit-base-patch16-224 (weights unchanged; no fine-tuning performed)
### Model Sources [optional]
<!-- Provide the basic links for the model. -->
- **Base model Repository:** [https://huggingface.co/google/vit-base-patch16-224](https://huggingface.co/google/vit-base-patch16-224)
- **Paper [optional]:** [Dosovitskiy et al., *An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale*, arXiv:2010.11929](https://arxiv.org/abs/2010.11929)
- **Demo [optional]:** None
## Uses
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
### Direct Use
<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
- Practicing `pipeline("image-classification", model=...)` usage
- Understanding the flow of uploading and downloading a model to/from the Hugging Face Hub in pipeline form (see the sketch below)
- Learning how vision models relate to `pipeline`
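The upload half of that flow is not shown elsewhere in this card. A minimal sketch, assuming you are already authenticated (e.g. via `huggingface-cli login`) and using a placeholder repo id:

```python
from transformers import AutoImageProcessor, AutoModelForImageClassification

# Load the base checkpoint; this repo reuses its weights unchanged.
model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")
processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224")

# Push both model and image processor to the Hub.
# "your-username/tmp-pl-image-classification" is a placeholder repo id.
model.push_to_hub("your-username/tmp-pl-image-classification")
processor.push_to_hub("your-username/tmp-pl-image-classification")
```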
### Downstream Use [optional]
<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
- ๋ณธ repo ์ž์ฒด๋Š” downstream task๋ฅผ ์œ„ํ•œ fine-tuning์„ ๋ชฉ์ ์œผ๋กœ ํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค.
- ํ•™์Šต ๋˜๋Š” ์„ฑ๋Šฅ ๋น„๊ต ๋ชฉ์ ์ด๋ผ๋ฉด **์›๋ณธ ๋ชจ๋ธ repo**๋ฅผ ์ง์ ‘ ์‚ฌ์šฉํ•˜๋Š” ๊ฒƒ์ด ์ ์ ˆํ•ฉ๋‹ˆ๋‹ค.
### Out-of-Scope Use
<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
- ๋ชจ๋ธ ์„ฑ๋Šฅ ํ‰๊ฐ€ ๋˜๋Š” ๋ฒค์น˜๋งˆํฌ
- ์‹ค์ œ ์„œ๋น„์Šค ํ™˜๊ฒฝ์—์„œ์˜ ๋ชจ๋ธ ๋ฐฐํฌ
- ํŠน์ • ๋„๋ฉ”์ธ(์˜๋ฃŒ, ์‚ฐ์—… ์˜์ƒ ๋“ฑ)์— ๋Œ€ํ•œ ์‹ ๋ขฐ์„ฑ ์žˆ๋Š” ์ถ”๋ก 
## Bias, Risks, and Limitations
<!-- This section is meant to convey both technical and sociotechnical limitations. -->
- ๋ณธ ๋ชจ๋ธ์€ ImageNet ๊ธฐ๋ฐ˜ ๋ฐ์ดํ„ฐ๋กœ ํ•™์Šต๋œ ์ผ๋ฐ˜ ๋ชฉ์  ์ด๋ฏธ์ง€ ๋ถ„๋ฅ˜ ๋ชจ๋ธ์˜ ํŠน์„ฑ์„ ๊ทธ๋Œ€๋กœ ๊ฐ€์ง‘๋‹ˆ๋‹ค.
- ํŠน์ • ๊ฐ์ฒด, ๋ฌธํ™”์  ๋งฅ๋ฝ, ์ „๋ฌธ ๋„๋ฉ”์ธ์— ๋Œ€ํ•œ ๋ถ„๋ฅ˜ ์„ฑ๋Šฅ์€ ๋ณด์žฅ๋˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค.
- ๋ณธ repo๋Š” **์—ฐ์Šต์šฉ pipeline ์ €์žฅ์†Œ**์ด๋ฏ€๋กœ ๋ชจ๋ธ์˜ ์‚ฌํšŒ์  ์˜ํ–ฅ์ด๋‚˜ ํŽธํ–ฅ ๋ถ„์„์„ ๋ชฉ์ ์œผ๋กœ ํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค.
### Recommendations
<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
- ์‹ค์ œ ์‚ฌ์šฉ ๋ชฉ์ ์ด ์žˆ๋Š” ๊ฒฝ์šฐ, ์›๋ณธ ๋ชจ๋ธ ์นด๋“œ(`google/vit-base-patch16-224`)์˜ ์ œํ•œ ์‚ฌํ•ญ์„ ๋ฐ˜๋“œ์‹œ ์ฐธ๊ณ ํ•˜์‹ญ์‹œ์˜ค.
- ์ด repo๋Š” ํ•™์Šต ๋ฐ ์‹ค์Šต ๋ชฉ์ ์— ํ•œํ•ด ์‚ฌ์šฉํ•˜๊ธฐ๋ฅผ ๊ถŒ์žฅํ•ฉ๋‹ˆ๋‹ค.
## How to Get Started with the Model
Use the code below to get started with the model. It is a minimal example that loads this model through the Hugging Face `pipeline` and classifies a sample image.
```python
from transformers import pipeline
from PIL import Image
import requests

# Fetch a sample image from the Hugging Face documentation assets.
img_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/cats.png"
image = Image.open(requests.get(img_url, stream=True).raw)

# Load this repo as an image-classification pipeline and run inference.
clf = pipeline(
    task="image-classification",
    model="dsaint31/tmp-pl-image-classification",
)
print(clf(image))
```
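The pipeline returns a list of `label`/`score` dicts, by default the top five ImageNet-1k classes ranked by score; pass `top_k` in the call to change how many predictions are returned.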
## Training Details
### Training Data
<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
* ๋ณธ repo์—์„œ๋Š” **์ถ”๊ฐ€ ํ•™์Šต์„ ์ˆ˜ํ–‰ํ•˜์ง€ ์•Š์•˜์Šต๋‹ˆ๋‹ค.**
* ์›๋ณธ ๋ชจ๋ธ์€ ImageNet-21k๋กœ ์‚ฌ์ „ํ•™์Šต(pretraining) ํ›„ ImageNet-1k๋กœ fine-tuning๋œ ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค.
### Training Procedure
<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
#### Preprocessing [optional]
* ์›๋ณธ ViT ๋ชจ๋ธ์˜ ๊ธฐ๋ณธ ์ด๋ฏธ์ง€ ์ „์ฒ˜๋ฆฌ(Image Processor)๋ฅผ ๊ทธ๋Œ€๋กœ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.
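For reference, the preprocessing the pipeline applies internally can be inspected by loading the image processor directly. A minimal sketch (`"cat.png"` is a placeholder path):

```python
from transformers import AutoImageProcessor
from PIL import Image

processor = AutoImageProcessor.from_pretrained("dsaint31/tmp-pl-image-classification")
print(processor)  # shows the resize to 224x224 plus normalization settings

# Applying it to a PIL image yields the pixel_values tensor the model consumes.
inputs = processor(images=Image.open("cat.png"), return_tensors="pt")
print(inputs["pixel_values"].shape)  # torch.Size([1, 3, 224, 224])
```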
#### Training Hyperparameters
- **Training regime:** Not applicable (no training performed) <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
#### Speeds, Sizes, Times [optional]
<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
## Evaluation
<!-- This section describes the evaluation protocols and provides the results. -->
### Testing Data, Factors & Metrics
#### Testing Data
<!-- This should link to a Dataset Card if possible. -->
* ๋ณธ repo์—์„œ๋Š” ๋ณ„๋„์˜ ํ‰๊ฐ€๋ฅผ ์ˆ˜ํ–‰ํ•˜์ง€ ์•Š์•˜์Šต๋‹ˆ๋‹ค.
* ์„ฑ๋Šฅ ์ง€ํ‘œ๋Š” ์›๋ณธ ๋ชจ๋ธ ์นด๋“œ์˜ ํ‰๊ฐ€ ๊ฒฐ๊ณผ๋ฅผ ์ฐธ๊ณ ํ•˜์‹ญ์‹œ์˜ค.
#### Factors
<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
[More Information Needed]
#### Metrics
<!-- These are the evaluation metrics being used, ideally with a description of why. -->
[More Information Needed]
### Results
[More Information Needed]
#### Summary
## Model Examination [optional]
<!-- Relevant interpretability work for the model goes here -->
[More Information Needed]
## Environmental Impact
<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
๋ณธ repo์—์„œ๋Š” ํ•™์Šต์„ ์ˆ˜ํ–‰ํ•˜์ง€ ์•Š์•˜์œผ๋ฏ€๋กœ ์ถ”๊ฐ€์ ์ธ ํ™˜๊ฒฝ์  ์˜ํ–ฅ์€ ์—†์Šต๋‹ˆ๋‹ค.
* ์›๋ณธ ๋ชจ๋ธ ํ•™์Šต์— ๋Œ€ํ•œ ํ™˜๊ฒฝ ์˜ํ–ฅ์€ base model ๋ฌธ์„œ๋ฅผ ์ฐธ๊ณ ํ•˜์‹ญ์‹œ์˜ค.
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]
## Technical Specifications [optional]
### Model Architecture and Objective
* Vision Transformer (ViT-Base, patch size 16, input resolution 224x224)
* Objective: Image classification
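These details can be confirmed from the checkpoint's configuration; a small sketch:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("google/vit-base-patch16-224")
print(config.patch_size)   # 16
print(config.image_size)   # 224
print(config.hidden_size)  # 768 (ViT-Base)
print(config.num_labels)   # 1000 (ImageNet-1k classes)
```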
### Compute Infrastructure
[More Information Needed]
#### Hardware
* ํ•ด๋‹น์—†์Œ (ํ•™์Šต ๋ฏธ์ˆ˜ํ–‰)
#### Software
* Transformers
* Pillow
* PyTorch
## Citation [optional]
<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
์›๋ณธ ๋ชจ๋ธ ์ธ์šฉ ์‹œ ์•„๋ž˜ ๋…ผ๋ฌธ์„ ์ฐธ๊ณ ํ•˜์‹ญ์‹œ์˜ค.
**BibTeX:**
```bibtex
@article{dosovitskiy2020image,
  title={An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},
  author={Dosovitskiy, Alexey and others},
  journal={arXiv preprint arXiv:2010.11929},
  year={2020}
}
```
**APA:**
[More Information Needed]
## Glossary [optional]
<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
[More Information Needed]
## More Information [optional]
[More Information Needed]
## Model Card Authors [optional]
* dsaint31 (pipeline practice repository)
## Model Card Contact
* Hugging Face profile: [https://huggingface.co/dsaint31](https://huggingface.co/dsaint31)