---
title: ImageCaptionTestSpace
emoji: 💻
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 5.49.0
app_file: app.py
pinned: false
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# Image Captioning (ViT-GPT2) – Hugging Face Space

This Space serves an image captioning model using Hugging Face's `VisionEncoderDecoderModel` (ViT encoder + GPT-2 decoder). It runs out of the box with the base model and can optionally load your **fine-tuned** weights.

**Live app entrypoint:** `app.py` (Gradio)
## Quick Start (on Spaces)

1. Click **New Space** → **Gradio** → **Blank** → pick a free CPU or T4 small (GPU) runtime.
2. Upload all files from this repo.
3. (Optional) If you have fine-tuned weights:
   - Upload the saved folder to the Space (e.g., `outputs/caption_finetune/`)
   - Set a Space secret or environment variable: `MODEL_DIR=outputs/caption_finetune`
   - Alternatively, push your weights to the Hub and set `MODEL_DIR=your-username/your-model-repo`

If `MODEL_DIR` is not set, the app falls back to `nlpconnect/vit-gpt2-image-captioning`.
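The fallback above can be sketched as a small resolver. This is an illustrative sketch, not necessarily the exact logic in `app.py`; the function name is an assumption:

```python
import os

# Default base model used when no fine-tuned weights are configured.
DEFAULT_MODEL = "nlpconnect/vit-gpt2-image-captioning"

def resolve_model_dir() -> str:
    """Return MODEL_DIR if set and non-empty, otherwise the base model id."""
    return os.environ.get("MODEL_DIR") or DEFAULT_MODEL

# The resolved value works for a local folder or a Hub repo id alike, since
# transformers' from_pretrained accepts either, e.g.:
#   VisionEncoderDecoderModel.from_pretrained(resolve_model_dir())
```

Resolving once at startup keeps the rest of the app indifferent to where the weights come from.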
## Local Dev

```bash
pip install -r requirements.txt
python app.py
# then open http://127.0.0.1:7860
```