	---
	tags:
	- image-to-text
	- image-captioning
	- endpoints-template
	license: bsd-3-clause
	library_name: generic
	---

	# Fork of [salesforce/BLIP](https://github.com/salesforce/BLIP) for an `image-captioning` task on 🤗 Inference Endpoints.

	This repository implements a `custom` task for `image-captioning` on 🤗 Inference Endpoints. The code for the customized pipeline is in [pipeline.py](https://huggingface.co/florentgbelidji/blip_captioning/blob/main/pipeline.py).
	To deploy this model as an Inference Endpoint, select `Custom` as the task so that the `pipeline.py` file is used. _Double-check that it is selected._
	### Expected request payload
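	The request body is JSON. Images are sent base64-encoded in the `inputs` list, the same field the Python example below uses.
	```json
	{
	  "inputs": ["/9j/4AAQSkZJRgABAQEBLAEsAAD/2wBDAAMCAgICAgMC...."]
	}
	```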
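The base64 string in the payload above can be produced from an image file's raw bytes with Python's standard `base64` module. The following is a minimal sketch: the helper names `encode_image` and `decode_image` are illustrative and not part of this repository, and a short JPEG/JFIF header stands in for a real image file.

```python
import base64


def encode_image(image_bytes: bytes) -> str:
    """Return the base64 text form of raw image bytes (hypothetical helper)."""
    return base64.b64encode(image_bytes).decode("utf-8")


def decode_image(encoded: str) -> bytes:
    """Inverse of encode_image: recover the original bytes."""
    return base64.b64decode(encoded)


# First 12 bytes of a JPEG/JFIF file, used here instead of a real image.
jfif_header = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01"
encoded_header = encode_image(jfif_header)
print(encoded_header)  # /9j/4AAQSkZJRgAB
```

This is why base64-encoded JPEG files, like the one in the payload, typically begin with `/9j/`.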
	Below is an example of how to run a request using Python and `requests`.
	## Run Request
	1. Prepare an image.
	```bash
	wget https://huggingface.co/datasets/mishig/sample_images/resolve/main/palace.jpg
	```
	2. Run the request.

	```python
	import base64

	import requests as r

	ENDPOINT_URL = ""  # URL of your Inference Endpoint
	HF_TOKEN = ""  # your Hugging Face access token


	def predict(path_to_image: str):
	    # Read the image and base64-encode it: raw bytes are not JSON serializable.
	    with open(path_to_image, "rb") as i:
	        image = base64.b64encode(i.read()).decode("utf-8")
	    payload = {
	        "inputs": [image],
	        "parameters": {
	            "do_sample": True,
	            "top_p": 0.9,
	            "min_length": 5,
	            "max_length": 20,
	        },
	    }
	    response = r.post(
	        ENDPOINT_URL, headers={"Authorization": f"Bearer {HF_TOKEN}"}, json=payload
	    )
	    return response.json()


	prediction = predict(path_to_image="palace.jpg")
	```
	Example parameters depending on the decoding strategy:

	1. Beam search

	```python
	"parameters": {
	    "num_beams": 5,
	    "max_length": 20
	}
	```

	2. Nucleus sampling

	```python
	"parameters": {
	    "num_beams": 1,
	    "max_length": 20,
	    "do_sample": True,
	    "top_k": 50,
	    "top_p": 0.95
	}
	```

	3. Contrastive search

	```python
	"parameters": {
	    "penalty_alpha": 0.6,
	    "top_k": 4,
	    "max_length": 512
	}
	```

	See the [generate()](https://huggingface.co/docs/transformers/v4.25.1/en/main_classes/text_generation#transformers.GenerationMixin.generate) documentation for additional details.
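The three parameter sets above can be kept in one place and selected by name when building a request. This is only a sketch of one way to organize them: `DECODING_PRESETS` and `build_parameters` are hypothetical names introduced here, and the values are copied from the examples above.

```python
# Decoding presets taken from the examples above (names are illustrative).
DECODING_PRESETS = {
    "beam_search": {"num_beams": 5, "max_length": 20},
    "nucleus_sampling": {
        "num_beams": 1,
        "max_length": 20,
        "do_sample": True,
        "top_k": 50,
        "top_p": 0.95,
    },
    "contrastive_search": {"penalty_alpha": 0.6, "top_k": 4, "max_length": 512},
}


def build_parameters(strategy: str, **overrides) -> dict:
    """Return a copy of the preset for `strategy`, with optional overrides."""
    if strategy not in DECODING_PRESETS:
        raise ValueError(f"Unknown decoding strategy: {strategy!r}")
    parameters = dict(DECODING_PRESETS[strategy])  # copy so presets stay intact
    parameters.update(overrides)
    return parameters


# Example: beam search with a longer caption limit.
payload_parameters = build_parameters("beam_search", max_length=30)
print(payload_parameters)  # {'num_beams': 5, 'max_length': 30}
```

The returned dictionary can be used as the `"parameters"` value of the request payload.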


	Expected output:
	```python
	['buckingham palace with flower beds and red flowers']
	```
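Since the expected output is a list with one caption per image, a client may want to check the shape of the response before using it. The sketch below assumes that anything other than a non-empty list of strings (for example, an error object returned as a dictionary) should be treated as a failure; `first_caption` is a hypothetical helper, not part of this repository.

```python
def first_caption(prediction) -> str:
    """Return the first caption from a decoded endpoint response.

    Assumes a successful response is a non-empty list of strings, as in the
    expected output above; anything else raises ValueError.
    """
    if (
        isinstance(prediction, list)
        and prediction
        and all(isinstance(caption, str) for caption in prediction)
    ):
        return prediction[0]
    raise ValueError(f"Unexpected response: {prediction!r}")


caption = first_caption(["buckingham palace with flower beds and red flowers"])
print(caption)  # buckingham palace with flower beds and red flowers
```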