Update README.md

README.md CHANGED

@@ -16,7 +16,8 @@ language:
 - en
 
 ---
-This is the repo for the paper [PromptCap: Prompt-Guided Task-Aware Image Captioning](https://arxiv.org/abs/2211.09699)
+This is the repo for the paper [PromptCap: Prompt-Guided Task-Aware Image Captioning](https://arxiv.org/abs/2211.09699). The paper was accepted to ICCV 2023 as [PromptCap: Prompt-Guided Image Captioning for VQA with GPT-3](https://openaccess.thecvf.com/content/ICCV2023/html/Hu_PromptCap_Prompt-Guided_Image_Captioning_for_VQA_with_GPT-3_ICCV_2023_paper.html).
+
 
 We introduce PromptCap, a captioning model that can be controlled by a natural language instruction. The instruction may contain a question that the user is interested in.
 For example: "what is the boy putting on?". PromptCap also supports generic captions, using the question "what does the image describe?"

@@ -43,7 +44,7 @@ Generate a prompt-guided caption by following:
 import torch
 from promptcap import PromptCap
 
-model = PromptCap("
+model = PromptCap("tifa-benchmark/promptcap-coco-vqa")  # also supports OFA checkpoints, e.g. "OFA-Sys/ofa-large"
 
 if torch.cuda.is_available():
     model.cuda()

@@ -87,7 +88,7 @@ import torch
 from promptcap import PromptCap_VQA
 
 # the QA model supports all UnifiedQA variants, e.g. "allenai/unifiedqa-v2-t5-large-1251000"
-vqa_model = PromptCap_VQA(promptcap_model="
+vqa_model = PromptCap_VQA(promptcap_model="tifa-benchmark/promptcap-coco-vqa", qa_model="allenai/unifiedqa-t5-base")
 
 if torch.cuda.is_available():
     vqa_model.cuda()
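The captioning checkpoint is steered entirely by the instruction string. As a minimal sketch of how such an instruction can be composed (the `build_prompt` helper is hypothetical, not part of the promptcap API; the template mirrors the question-guided phrasing of the examples above, including the README's generic-caption question):

```python
# Hypothetical helper (not part of the promptcap package): compose the
# natural-language instruction that steers PromptCap's caption.
def build_prompt(question=None):
    if question is None:
        # Per the README, generic captioning uses this fixed question.
        question = "what does the image describe?"
    return f"please describe this image according to the given question: {question}"

print(build_prompt("what is the boy putting on?"))
# -> please describe this image according to the given question: what is the boy putting on?
```

The resulting string would be passed to the model alongside an image; the exact call signature is whatever the installed promptcap version exposes.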
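PromptCap_VQA chains the two models: the captioner first turns the image into a question-aware caption, and that caption then serves as reading-comprehension context for the UnifiedQA model. A rough sketch of the glue step (the function name and the exact UnifiedQA separator are assumptions, not promptcap internals):

```python
# Hypothetical glue (not promptcap internals): format a question plus a
# PromptCap caption as a single UnifiedQA-style input string.
def compose_qa_input(question, caption):
    # UnifiedQA-style inputs are lowercased "question \n context" strings;
    # the literal "\n" separator here is an assumption.
    return f"{question} \\n {caption}".lower()

qa_input = compose_qa_input(
    "What is the boy putting on?",
    "A boy is putting on a blue jacket.",  # e.g. a PromptCap caption
)
print(qa_input)
# -> what is the boy putting on? \n a boy is putting on a blue jacket.
```

This two-stage design is why any UnifiedQA variant can be dropped in via the `qa_model` argument: the QA model only ever sees text.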