| license: mit | |
| title: Image_captioning | |
| sdk: streamlit | |
| emoji: ๐ | |
| colorFrom: red | |
| colorTo: red | |
| pinned: true | |
| # Image Captioning App | |
| This Streamlit app uses a pre-trained model to generate captions for uploaded images. | |
| ## Challenges Faced | |
| 1. **Image Processing**: Ensuring correct image preprocessing to match the model's input requirements. | |
| 2. **Tensor Conversion**: Handling the conversion of image data to the appropriate tensor format. | |
| 3. **Error Handling**: Implementing error handling and logging. | |
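The error-handling challenge above could be approached roughly as follows. This is a minimal sketch, not the app's actual code: `safe_caption`, the logger name, and the fallback message are hypothetical, and `caption_fn` stands in for whatever function produces a caption from an image.

```python
import logging

# Hypothetical logger name; the real app may configure logging differently.
logger = logging.getLogger("image_captioning")


def safe_caption(caption_fn, image,
                 fallback="Could not generate a caption for this image."):
    """Call caption_fn(image) and return (ok, text) instead of raising.

    Any exception is logged with its traceback, and the fallback message
    is returned so the UI can show something useful to the user.
    """
    try:
        caption = caption_fn(image)
    except Exception:
        # logger.exception records the message plus the active traceback.
        logger.exception("Caption generation failed")
        return False, fallback

    # Treat an empty or non-string result as a failure as well.
    if not isinstance(caption, str) or not caption.strip():
        logger.warning("Model returned an empty caption")
        return False, fallback

    return True, caption.strip()
```

Returning a flag alongside the text lets the caller choose between a success and an error message without wrapping every call in its own `try` block.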
| ## Models Used | |
| The app uses the following pre-trained model from Hugging Face: | |
| - **Model**: `nlpconnect/vit-gpt2-image-captioning` | |
| - **Architecture**: Vision Encoder-Decoder Model | |
| - **Vision Encoder**: ViT (Vision Transformer) | |
| - **Text Decoder**: GPT-2 | |
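As a rough illustration of how the encoder and decoder listed above are used together, the sketch below loads the checkpoint through the `transformers` classes `VisionEncoderDecoderModel`, `ViTImageProcessor`, and `AutoTokenizer`, then generates one caption. The function names `load_captioner` and `generate_caption` and the generation settings (`max_length=16`, `num_beams=4`) are assumptions for illustration and may differ from what `image_to_text.py` does.

```python
MODEL_ID = "nlpconnect/vit-gpt2-image-captioning"


def load_captioner(model_id=MODEL_ID):
    """Load the model, image processor, and tokenizer for the checkpoint."""
    # Imported here so the module can be imported without loading transformers.
    from transformers import (
        AutoTokenizer,
        VisionEncoderDecoderModel,
        ViTImageProcessor,
    )

    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    processor = ViTImageProcessor.from_pretrained(model_id)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model.eval()  # inference only, disables dropout
    return model, processor, tokenizer


def generate_caption(image, model, processor, tokenizer,
                     max_length=16, num_beams=4):
    """Return a caption string for a PIL image."""
    # The processor resizes and normalizes the image and returns a tensor
    # in the layout the ViT encoder expects.
    inputs = processor(images=image.convert("RGB"), return_tensors="pt")
    output_ids = model.generate(
        pixel_values=inputs.pixel_values,
        max_length=max_length,
        num_beams=num_beams,
    )
    # Decode the first (and only) sequence, dropping special tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True).strip()
```

A typical call would be `generate_caption(Image.open("photo.jpg"), *load_captioner())`, where `photo.jpg` is a placeholder path. Calling `.convert("RGB")` first guards against grayscale or RGBA uploads, which is one common source of the preprocessing problems mentioned earlier.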
| ## Steps for Deployment | |
| 1. **Set up the environment**: | |
| ``` | |
| python -m venv venv | |
| source venv/bin/activate # On Windows, use `venv\Scripts\activate` | |
| pip install -r requirements.txt | |
| ``` | |
| 2. **Prepare the files**: | |
| - Ensure `app.py`, `image_to_text.py`, and `requirements.txt` are in the project directory. | |
| 3. **Run the app locally**: | |
| ``` | |
| streamlit run app.py | |
| ``` | |
| 4. **Deploy to Streamlit Cloud** (optional): | |
| - Push your code to a GitHub repository. | |
| - Connect your GitHub account to Streamlit Cloud. | |
| - Select the repository and branch to deploy. | |
| - Configure the app settings and deploy. | |
| 5. **Alternative Deployment Options**: | |
| - Deploy to Heroku using a `Procfile` and `runtime.txt`. | |
| - Use Docker to containerize the app for deployment on platforms like AWS, Google Cloud, or Azure. | |
| ## Requirements | |
| See `requirements.txt` for a full list of dependencies. Key libraries include: | |
| - streamlit | |
| - torch | |
| - transformers | |
| - Pillow | |
| - numpy | |
| ## Usage | |
| 1. Run the Streamlit app. | |
| 2. Upload an image using the file uploader. | |
| 3. Click the "Generate Caption" button. | |
| 4. View the generated caption below the image. | |
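The usage flow above maps onto a handful of Streamlit calls. The following is a sketch under assumptions rather than the contents of `app.py`: `render_app` is a hypothetical helper, and the `st` module, the captioning function, and the image opener are passed in as arguments so the flow can be exercised without a running Streamlit server.

```python
def render_app(st, caption_fn, open_image):
    """Draw the upload -> button -> caption flow and return the caption.

    st         : the streamlit module (or a stand-in with the same calls)
    caption_fn : callable taking an image and returning a caption string
    open_image : callable turning the uploaded file into an image,
                 e.g. PIL.Image.open
    """
    st.title("Image Captioning App")

    # Step 2: file uploader; returns None until the user picks a file.
    uploaded = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])
    if uploaded is None:
        return None

    image = open_image(uploaded)
    st.image(image, caption="Uploaded image")

    # Step 3: st.button is True only on the rerun triggered by the click.
    if not st.button("Generate Caption"):
        return None

    # Step 4: show a spinner while the model runs, then the caption.
    with st.spinner("Generating caption..."):
        caption = caption_fn(image)
    st.success(caption)
    return caption


# In app.py this could be wired up roughly as follows, assuming
# image_to_text.py exposes a one-argument captioning function
# (a hypothetical name):
#
#   import streamlit as st
#   from PIL import Image
#   from image_to_text import caption_image
#   render_app(st, caption_image, Image.open)
```

Because Streamlit reruns the whole script on every interaction, the early `return None` branches are what keep the caption from being generated before the button is clicked.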
| ## Future Improvements | |
| - Implement multiple model options for comparison. | |
| - Evaluate stronger captioning models to improve caption quality. | |
| - Add support for batch processing of images. | |
| - Improve the UI with additional styling and user feedback. |