bitpoint/ImageAnalysis

bitpoint committed on Oct 18, 2024

Commit

c076c03

1 Parent(s): 644b7ec

Upload README.md with huggingface_hub

Files changed (1)

README.md ADDED

+    # Model Card for vit-gpt2-image-captioning
+    ## Model Details
+    This model is a VisionEncoderDecoderModel that pairs a ViT encoder with a GPT-2 decoder to generate image captions. It was fine-tuned with added context information to help it produce more meaningful captions.
+    - **Base Model**: nlpconnect/vit-gpt2-image-captioning
+    - **Processor**: ViTImageProcessor
+    - **Tokenizer**: GPT-2 Tokenizer
+    - **Generated Caption Example**: "{generated_text}"
+    ## Intended Use
+    This model is intended for captioning stock-related images, given an initial context that helps it produce more accurate descriptions.
+    ## Limitations
+    - The model may generate incorrect or biased descriptions, depending on the input image or context.
+    - It performs best when given specific context inputs.
+    ## How to Use
+    ```python
+    from transformers import AutoTokenizer, ViTImageProcessor, VisionEncoderDecoderModel
+    # Placeholder repo id: replace with the Hub repository that hosts this model.
+    repo_id = "your_username/your_model_name"
+    # The caption model, image preprocessor, and tokenizer all load from the same repo.
+    model = VisionEncoderDecoderModel.from_pretrained(repo_id)
+    processor = ViTImageProcessor.from_pretrained(repo_id)
+    tokenizer = AutoTokenizer.from_pretrained(repo_id)
+    ```
+    ## License
+    This model is licensed under the same terms as the original nlpconnect/vit-gpt2-image-captioning.
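
The How to Use snippet in the card only loads the three components and stops before producing a caption. A minimal sketch of the remaining step might look like the following. It follows the usual `VisionEncoderDecoderModel` captioning pattern (preprocess, `generate`, decode). The helper names `load_components` and `caption_image`, and the `max_length=16` and `num_beams=4` defaults, are assumptions for illustration and are not taken from the card. The card does not say how the initial context is passed to the model, so this sketch leaves it out.

```python
def load_components(repo_id):
    """Load the caption model, image processor, and tokenizer from one Hub repo."""
    # Imported lazily so caption_image() can be used with any compatible objects.
    from transformers import AutoTokenizer, ViTImageProcessor, VisionEncoderDecoderModel

    model = VisionEncoderDecoderModel.from_pretrained(repo_id)
    processor = ViTImageProcessor.from_pretrained(repo_id)
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    return model, processor, tokenizer


def caption_image(model, processor, tokenizer, image, max_length=16, num_beams=4):
    """Return one caption string for a single RGB image (hypothetical helper)."""
    # Resize/normalize the image into the tensor the ViT encoder expects.
    pixel_values = processor(images=[image], return_tensors="pt").pixel_values
    # Beam-search decode with the GPT-2 decoder; the defaults are assumed values.
    output_ids = model.generate(pixel_values, max_length=max_length, num_beams=num_beams)
    # Convert token ids back to text and drop special tokens and stray whitespace.
    captions = tokenizer.batch_decode(output_ids, skip_special_tokens=True)
    return captions[0].strip()
```

To try it, call `load_components("your_username/your_model_name")` with the placeholder replaced by a real repository id, open a picture with Pillow (`Image.open(path).convert("RGB")`), and pass the three components and the image to `caption_image`.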