	---
	title: Image Captioning Api
	emoji: 😻
	colorFrom: indigo
	colorTo: blue
	sdk: docker
	pinned: false
	license: mit
	short_description: API Endpoint for Image Captioning
	---

	# Image Captioning API

	A RESTful API for generating captions from images using a Transformer-based
	model. This service is designed to be deployed on Hugging Face Spaces.

	## Features

	- Upload an image file (JPEG, PNG, and other common formats)
	- Get AI-generated captions based on image content
	- FastAPI-based REST API with interactive Swagger UI documentation at `/docs`

	## API Endpoints

	- `GET /` - API information and usage
	- `POST /generate` - Upload an image and get a caption
	- `GET /health` - Health check endpoint
	- `GET /docs` - Swagger UI documentation
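
A hosted Space can take a moment to start responding, so a client may want to poll `GET /health` before sending images. The following is a minimal sketch using only the Python standard library. The response body of `/health` is not documented here, so the sketch relies only on the HTTP status code; the function names, retry counts, and delays are illustrative assumptions, not part of this project.

```python
import time
import urllib.request


def is_healthy(base_url, timeout=10, opener=urllib.request.urlopen):
    """Return True if GET /health answers with HTTP 200."""
    try:
        with opener(f"{base_url.rstrip('/')}/health", timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, connection errors, and timeouts all derive from OSError.
        return False


def wait_until_healthy(base_url, attempts=10, delay=5.0,
                       check=is_healthy, sleep=time.sleep):
    """Poll /health until it succeeds or the attempts run out."""
    for attempt in range(attempts):
        if check(base_url):
            return True
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next attempt
    return False
```

For example, `wait_until_healthy("https://your-space-name.hf.space")` returns `True` once the endpoint answers with status 200, or `False` after the last attempt fails.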

	## How to Use

	### API Request Example

	```bash
	curl -X POST "https://your-space-name.hf.space/generate" \
	-H "accept: application/json" \
	-H "Content-Type: multipart/form-data" \
	-F "image=@your_image.jpg" \
	-F "max_length=20"
	```
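
The same request can be sent from Python. This is a sketch using only the standard library, assuming the form field names `image` and `max_length` from the curl command above. The helper names `encode_multipart` and `request_caption`, the `image/jpeg` content type, and the timeout are illustrative choices rather than part of this project.

```python
import json
import os
import urllib.request
import uuid


def encode_multipart(fields, files):
    """Encode form fields and files as a multipart/form-data body.

    fields: dict of name -> str value
    files:  dict of name -> (filename, bytes, content_type)
    Returns (body_bytes, content_type_header).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(part.encode("utf-8"))
    for name, (filename, data, content_type) in files.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            f"Content-Type: {content_type}\r\n\r\n"
        )
        parts.append(header.encode("utf-8") + data + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def request_caption(base_url, image_path, max_length=20, timeout=60):
    """POST an image to /generate and return the decoded JSON response."""
    with open(image_path, "rb") as f:
        image_bytes = f.read()
    body, content_type = encode_multipart(
        fields={"max_length": str(max_length)},
        # "image/jpeg" is an assumption; adjust it to the file's actual type.
        files={"image": (os.path.basename(image_path), image_bytes, "image/jpeg")},
    )
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/generate",
        data=body,
        headers={"Content-Type": content_type, "Accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `request_caption("https://your-space-name.hf.space", "your_image.jpg")` would return the parsed JSON response as a dictionary, with the placeholder URL replaced by the address of a running Space.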

	### API Response Example

	```json
	{
	  "caption": "a person riding a snowboard down a snow covered slope",
	  "image": "base64_encoded_image_data..."
	}
	```
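
A client will typically want the caption as text and the `image` field as raw bytes. The sketch below assumes the response has exactly the two fields shown above and that `image` is standard base64. Whether the value carries a `data:` URI prefix is not stated here, so the code strips one if present. The function name `parse_caption_response` is hypothetical.

```python
import base64
import json


def parse_caption_response(payload):
    """Return (caption, image_bytes) from a /generate JSON response body."""
    data = json.loads(payload)
    caption = data["caption"]
    encoded = data.get("image", "")
    # Assumption: the field may start with a data-URI prefix such as
    # "data:image/jpeg;base64,"; drop it before decoding.
    if encoded.startswith("data:") and "," in encoded:
        encoded = encoded.split(",", 1)[1]
    image_bytes = base64.b64decode(encoded) if encoded else b""
    return caption, image_bytes


# Usage with a synthetic payload (not real API output).
sample = json.dumps({
    "caption": "a dog on a beach",
    "image": base64.b64encode(b"fake-bytes").decode("ascii"),
})
caption, image_bytes = parse_caption_response(sample)
```

The decoded bytes can then be written to disk with `open("result.jpg", "wb")` or passed to an image library.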

	## Local Development

	### Prerequisites

	- Python 3.9+
	- pip

	### Setup

	1. Clone the repository
	2. Install dependencies:
	   ```bash
	   pip install -r requirements.txt
	   ```
	3. Run the application:
	   ```bash
	   python app.py
	   ```
	4. Visit http://localhost:7860/docs to access the API documentation

	## Deployment on Hugging Face Spaces

	This application is designed to be deployed on
	[Hugging Face Spaces](https://huggingface.co/spaces) using Docker.

	1. Create a new Space on Hugging Face
	2. Select Docker as the SDK
	3. Upload all project files, including the `Dockerfile`, to the Space repository
	4. Hugging Face will automatically build and deploy the application

	## Technical Details

	- Model: ResNet50 encoder with Transformer decoder
	- Framework: PyTorch
	- API: FastAPI
	- Image Processing: torchvision and PIL
	- Model Hosting: Hugging Face Hub

	## License

	MIT