witron-image-captioning

Image-to-Text

generic

image-captioning

endpoints-template

SlowPacer committed on Jun 9, 2023

Commit 12d3cce (1 parent: c1fd285)

Update README.md

Files changed (1)

README.md +10 -83

README.md CHANGED

@@ -5,94 +5,21 @@ tags:
 - endpoints-template
 license: bsd-3-clause
 library_name: generic
-duplicated_from: florentgbelidji/blip_captioning
 ---
-# Fork of [salesforce/BLIP](https://github.com/salesforce/BLIP) for a `image-captioning` task on 🤗Inference endpoint.
-This repository implements a `custom` task for `image-captioning` for 🤗 Inference Endpoints.
-To use deploy this model a an Inference Endpoint you have to select `Custom` as task to use the `pipeline.py` file. -> _double check if it is selected_
-### expected Request payload
 ```json
 {
-  "image": "/9j/4AAQSkZJRgABAQEBLAEsAAD/2wBDAAMCAgICAgMC....", // base64 image as bytes
 }
 ```
-below is an example on how to run a request using Python and `requests`.
-## Run Request
-1. prepare an image.
-```bash
-!wget https://huggingface.co/datasets/mishig/sample_images/resolve/main/palace.jpg
-```
-2.run request
-```python
-import json
-from typing import List
-import requests as r
-import base64
-ENDPOINT_URL = ""
-HF_TOKEN = ""
-def predict(path_to_image: str = None):
-    with open(path_to_image, "rb") as i:
-        image = i.read()
-    payload = {
-        "inputs": [image],
-        "parameters": {
-                   "do_sample": True,
-                   "top_p":0.9,
-                   "min_length":5,
-                   "max_length":20
-        }
-    }
-    response = r.post(
-        ENDPOINT_URL, headers={"Authorization": f"Bearer {HF_TOKEN}"}, json=payload
-    )
-    return response.json()
-prediction = predict(
-    path_to_image="palace.jpg"
-)
-```
-Example parameters depending on the decoding strategy:
-1. Beam search
-```
-        "parameters": {
-                   "num_beams":5,
-                   "max_length":20
-        }
-```
-2. Nucleus sampling
-```
-        "parameters": {
-                   "num_beams":1,
-                   "max_length":20,
-                   "do_sample": True,
-                   "top_k":50,
-                   "top_p":0.95
-        }
-```
-3. Contrastive search
-```
-        "parameters": {
-                   "penalty_alpha":0.6,
-                   "top_k":4
-                   "max_length":512
-        }
-```
-See [generate()](https://huggingface.co/docs/transformers/v4.25.1/en/main_classes/text_generation#transformers.GenerationMixin.generate) doc for additional detail
-expected output
-```python
-['buckingham palace with flower beds and red flowers']
-```

 - endpoints-template
 license: bsd-3-clause
 library_name: generic
 ---
+# Image captioning
+For deployment as an inference endpoint, using a Custom task type – a fixed version of [this repo](https://huggingface.co/florentgbelidji/blip_captioning) (updated to decode the base64 image strings)
+## Request payload
 ```json
 {
+  "inputs": ["/9j/4AAQSkZJRgABAQEBLAEsAAD/2wBDAAMCAgICAgMC...."], // base64-encoded image
 }
 ```
+## Response payload
+```json
+{
+  "captions": ["inferred caption for image"]
+}
+```
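
The commit removes the README's Python request example, so here is a minimal client sketch for the new payload format, offered as an illustration rather than the author's code. It assumes the endpoint accepts a JSON body of the form `{"inputs": [<base64 string>]}` and answers with `{"captions": [...]}`, as the updated README states, and that authentication uses a Bearer token as in the removed example. The names `build_payload` and `caption_image` are hypothetical, and the endpoint URL and token are placeholders to be supplied by the caller.

```python
import base64
import json
import urllib.request


def build_payload(image_bytes: bytes) -> dict:
    """Wrap raw image bytes in the request shape shown in the updated README."""
    # JSON cannot carry raw bytes, so the image is sent as a base64 string.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"inputs": [encoded]}


def caption_image(path_to_image: str, endpoint_url: str, hf_token: str) -> list:
    """POST one image to the endpoint and return the list under "captions".

    Assumption: Bearer-token auth and a JSON response with a "captions" key.
    """
    with open(path_to_image, "rb") as f:
        payload = build_payload(f.read())
    request = urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {hf_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["captions"]
```

A call would look like `caption_image("palace.jpg", ENDPOINT_URL, HF_TOKEN)`, with both constants filled in for the deployed endpoint. Note that the `// base64-encoded image` comment in the README's payload example is for the reader only; strict JSON parsers reject comments, so it should not be sent in a real request.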