---
license: mit
title: Image_captioning
sdk: streamlit
emoji: 🚀
colorFrom: red
colorTo: red
pinned: true
---
# Image Captioning App
This Streamlit app uses a pre-trained model to generate captions for uploaded images.
## Challenges Faced
1. **Image Processing**: Ensuring correct image preprocessing to match the model's input requirements.
2. **Tensor Conversion**: Handling the conversion of image data to the appropriate tensor format.
3. **Error Handling**: Implementing error handling and logging so that failures are caught and reported.
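The challenges above can be illustrated with a minimal sketch. It assumes the image has already been decoded into an `(H, W, 3)` `uint8` NumPy array at the resolution the model expects, and that pixels are normalized with a mean and standard deviation of 0.5 per channel. The helper names `to_model_input` and `safe_call` and the normalization constants are assumptions for illustration, not taken from the app's code; in practice the image processor shipped with the model would normally perform this step.

```python
import logging

import numpy as np

logger = logging.getLogger("image_captioning")


def to_model_input(image, mean=0.5, std=0.5):
    """Convert an (H, W, 3) uint8 image into a (1, 3, H, W) float32 array.

    The mean/std defaults are assumed values; check the model's
    preprocessor configuration before relying on them.
    """
    array = np.asarray(image)
    if array.ndim != 3 or array.shape[2] != 3:
        raise ValueError(f"expected an (H, W, 3) RGB image, got shape {array.shape}")
    scaled = array.astype(np.float32) / 255.0   # map [0, 255] to [0, 1]
    normalized = (scaled - mean) / std          # map [0, 1] to [-1, 1] with the defaults
    channels_first = np.transpose(normalized, (2, 0, 1))
    return channels_first[np.newaxis, ...]      # add the batch dimension


def safe_call(fn, *args, default=None):
    """Run fn(*args); log the traceback and return `default` if it raises."""
    try:
        return fn(*args)
    except Exception:
        logger.exception("%s failed", getattr(fn, "__name__", "call"))
        return default
```

Wrapping the conversion in `safe_call` keeps a malformed upload (for example, a grayscale image with no channel axis) from crashing the app while still leaving a traceback in the logs.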
## Models Used
The app uses the following pre-trained model from Hugging Face:
- **Model**: `nlpconnect/vit-gpt2-image-captioning`
- **Architecture**: Vision Encoder-Decoder Model
- **Vision Encoder**: ViT (Vision Transformer)
- **Text Decoder**: GPT-2
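As a rough sketch of how this model might be loaded and called through the `transformers` library: the function names `load_components` and `generate_caption` are hypothetical, and the generation settings (`max_length=16`, `num_beams=4`) are assumed defaults rather than values confirmed by this app. The app's own `image_to_text.py` may be organized differently.

```python
MODEL_ID = "nlpconnect/vit-gpt2-image-captioning"


def load_components(model_id=MODEL_ID):
    """Load the encoder-decoder model, its image processor, and its tokenizer."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoTokenizer, ViTImageProcessor, VisionEncoderDecoderModel

    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    processor = ViTImageProcessor.from_pretrained(model_id)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model.eval()  # inference only
    return model, processor, tokenizer


def generate_caption(image, model, processor, tokenizer, max_length=16, num_beams=4):
    """Return one caption for a single RGB image (for example, a PIL image)."""
    # The processor resizes and normalizes the image and returns a tensor batch.
    pixel_values = processor(images=[image], return_tensors="pt").pixel_values
    # The ViT encoder embeds the image; the GPT-2 decoder generates token ids.
    output_ids = model.generate(pixel_values, max_length=max_length, num_beams=num_beams)
    captions = tokenizer.batch_decode(output_ids, skip_special_tokens=True)
    return captions[0].strip()
```

Passing the model, processor, and tokenizer in as arguments lets the app load them once and reuse them across requests instead of reloading on every caption.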
## Steps for Deployment
1. **Set up the environment**:
```
python -m venv venv
source venv/bin/activate # On Windows, use `venv\Scripts\activate`
pip install -r requirements.txt
```
2. **Prepare the files**:
- Ensure `app.py`, `image_to_text.py`, and `requirements.txt` are in the project directory.
3. **Run the app locally**:
```
streamlit run app.py
```
4. **Deploy to Streamlit Cloud** (optional):
- Push your code to a GitHub repository.
- Connect your GitHub account to Streamlit Cloud.
- Select the repository and branch to deploy.
- Configure the app settings and deploy.
5. **Alternative Deployment Options**:
- Deploy to Heroku using a `Procfile` and `runtime.txt`.
- Use Docker to containerize the app for deployment on platforms like AWS, Google Cloud, or Azure.
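Before running or deploying, it can help to confirm that the files named in step 2 are actually present. The following is a small optional pre-flight check; the function name `missing_files` is hypothetical and not part of the app.

```python
from pathlib import Path

# File names taken from the "Prepare the files" step.
REQUIRED_FILES = ("app.py", "image_to_text.py", "requirements.txt")


def missing_files(project_dir=".", required=REQUIRED_FILES):
    """Return the required file names that are absent from project_dir."""
    root = Path(project_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    absent = missing_files()
    if absent:
        print("Missing files:", ", ".join(absent))
    else:
        print("All required files are present.")
```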
## Requirements
See `requirements.txt` for a full list of dependencies. Key libraries include:
- streamlit
- torch
- transformers
- Pillow
- numpy
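A quick way to check that these libraries are importable in the active environment is sketched below. Note that Pillow is installed under the name `Pillow` but imported as `PIL`. The helper name `missing_libraries` is an assumption for illustration.

```python
from importlib.util import find_spec

# Maps the name used with pip to the top-level module name used in imports.
KEY_LIBRARIES = {
    "streamlit": "streamlit",
    "torch": "torch",
    "transformers": "transformers",
    "Pillow": "PIL",
    "numpy": "numpy",
}


def missing_libraries(libraries=KEY_LIBRARIES):
    """Return the pip names of libraries whose module cannot be found."""
    return [pkg for pkg, module in libraries.items() if find_spec(module) is None]


if __name__ == "__main__":
    absent = missing_libraries()
    print("Missing:", ", ".join(absent) if absent else "none")
```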
## Usage
1. Run the Streamlit app.
2. Upload an image using the file uploader.
3. Click the "Generate Caption" button.
4. View the generated caption below the image.
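The flow above could be wired up in Streamlit roughly as follows. This is a sketch, not the app's actual `app.py`: the `render` function and its `caption_fn` parameter are hypothetical, and the Streamlit module is passed in as an argument so the logic can be exercised without a running server.

```python
def render(st, caption_fn):
    """Draw the upload/caption UI and return the caption, or None if not generated.

    `st` is the streamlit module; `caption_fn` is any callable that takes the
    uploaded file and returns a caption string (both are assumptions here).
    """
    st.title("Image Captioning App")
    uploaded = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])
    if uploaded is None:
        return None  # nothing to show until the user uploads a file

    st.image(uploaded, caption="Uploaded image")
    if st.button("Generate Caption"):
        caption = caption_fn(uploaded)
        st.write(caption)  # the caption appears below the image
        return caption
    return None


# In app.py, this might be called as:
#   import streamlit as st
#   render(st, my_caption_function)
```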
## Future Improvements
- Implement multiple model options for comparison.
- Evaluate stronger captioning models to improve caption quality.
- Add support for batch processing of images.
- Improve the UI with additional styling and user feedback.
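As one possible starting point for the batch-processing item, the sketch below splits a list of images into fixed-size chunks and captions each chunk in turn. The names `batched` and `caption_in_batches`, the `caption_batch` callable, and the default batch size of 8 are all assumptions for illustration.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items`, each with at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def caption_in_batches(images, caption_batch, batch_size=8):
    """Caption all images, calling caption_batch once per chunk.

    `caption_batch` is a hypothetical callable that takes a list of images
    and returns a list of captions in the same order.
    """
    captions = []
    for chunk in batched(list(images), batch_size):
        captions.extend(caption_batch(chunk))
    return captions
```

Processing in chunks bounds memory use per model call while preserving the input order of the captions.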