Fork of Salesforce/blip-image-captioning-large for an image-captioning task on 🤗 Inference Endpoints.
This repository implements a custom image-captioning task for 🤗 Inference Endpoints. The code for the customized pipeline is in pipeline.py.
To deploy this model as an Inference Endpoint, you have to select Custom as the task so that the handler.py file is used. Double-check that Custom is selected.
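For reference, handler.py must expose an `EndpointHandler` class that Inference Endpoints instantiates on startup and calls for every request. The actual implementation in this repository may differ; the following is only a minimal sketch, assuming the list-based `inputs` / `images` / `texts` payload used in the request example further below:

```python
import base64
from io import BytesIO
from typing import Any, Dict, List

import torch
from PIL import Image
from transformers import BlipForConditionalGeneration, BlipProcessor


class EndpointHandler:
    def __init__(self, path: str = ""):
        # load BLIP processor and model weights from the repository path
        self.processor = BlipProcessor.from_pretrained(path)
        self.model = BlipForConditionalGeneration.from_pretrained(path)
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.model.to(self.device)

    def __call__(self, data: Dict[str, Any]) -> Dict[str, List[str]]:
        inputs = data.get("inputs", data)
        # decode base64-encoded images into PIL images
        images = [
            Image.open(BytesIO(base64.b64decode(img))).convert("RGB")
            for img in inputs["images"]
        ]
        texts = inputs.get("texts")  # optional prompt prefix(es)
        parameters = data.get("parameters", {})  # forwarded to generate()

        if texts:
            model_inputs = self.processor(images, texts, return_tensors="pt")
        else:
            model_inputs = self.processor(images, return_tensors="pt")
        model_inputs = model_inputs.to(self.device)

        output_ids = self.model.generate(**model_inputs, **parameters)
        captions = self.processor.batch_decode(output_ids, skip_special_tokens=True)
        return {"captions": captions}
```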
Expected Request payload

```json
{
  "image": "/9j/4AAQSkZJRgA.....",  # base64-encoded image
  "text": "a photography of a"
}
```
Below is an example of how to run a request using Python and the requests library.
Run Request
1. Use any online image.

```bash
!wget https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg
```

2. Run the request.
```python
import base64

import requests

ENDPOINT_URL = ""  # your Inference Endpoint URL
HF_TOKEN = ""      # your Hugging Face access token

# base64-encode the downloaded demo image
with open("/content/demo.jpg", "rb") as image_file:
    encoded_string = base64.b64encode(image_file.read()).decode()

headers = {
    "Authorization": f"Bearer {HF_TOKEN}",
    "Content-Type": "application/json",
}


def query(payload):
    response = requests.post(ENDPOINT_URL, headers=headers, json=payload)
    return response.json()


output = query({
    "inputs": {
        "images": [encoded_string],  # base64-encoded image
        "texts": ["a photography of"],  # optional prompt prefix
    }
})
print(output)
```
Example parameters depending on the decoding strategy:
- Beam search

```json
"parameters": {
  "num_beams": 5,
  "max_length": 20
}
```

- Nucleus sampling

```json
"parameters": {
  "num_beams": 1,
  "max_length": 20,
  "do_sample": true,
  "top_k": 50,
  "top_p": 0.95
}
```

- Contrastive search

```json
"parameters": {
  "penalty_alpha": 0.6,
  "top_k": 4,
  "max_length": 512
}
```
See the generate() documentation for additional details.
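With the `query` helper above, these parameters are simply sent alongside the inputs. For example, a beam-search request could look like this:

```python
output = query({
    "inputs": {
        "images": [encoded_string],
        "texts": ["a photography of"],
    },
    # forwarded to generate() by the handler
    "parameters": {
        "num_beams": 5,
        "max_length": 20,
    },
})
print(output)
```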
Expected output

```
{'captions': ['a photography of a woman and her dog on the beach']}
```
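The caption string can then be read directly from the response:

```python
caption = output["captions"][0]
# "a photography of a woman and her dog on the beach"
```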