Update README.md

README.md CHANGED

@@ -16,7 +16,8 @@ language:
 - en
 
 ---
-This is the repo for the paper [PromptCap: Prompt-Guided Task-Aware Image Captioning](https://arxiv.org/abs/2211.09699)
+This is the repo for the paper [PromptCap: Prompt-Guided Task-Aware Image Captioning](https://arxiv.org/abs/2211.09699). The paper was accepted to ICCV 2023 as [PromptCap: Prompt-Guided Image Captioning for VQA with GPT-3](https://openaccess.thecvf.com/content/ICCV2023/html/Hu_PromptCap_Prompt-Guided_Image_Captioning_for_VQA_with_GPT-3_ICCV_2023_paper.html).
+
 
 We introduce PromptCap, a captioning model that can be controlled by a natural language instruction. The instruction may contain a question that the user is interested in.
 For example: "what is the boy putting on?". PromptCap also supports generic captions, using the question "what does the image describe?"

@@ -43,7 +44,7 @@ Generate a prompt-guided caption by following:
 import torch
 from promptcap import PromptCap
 
-model = PromptCap("
+model = PromptCap("tifa-benchmark/promptcap-coco-vqa")  # also supports OFA checkpoints, e.g. "OFA-Sys/ofa-large"
 
 if torch.cuda.is_available():
     model.cuda()

@@ -87,7 +88,7 @@ import torch
 from promptcap import PromptCap_VQA
 
 # the QA model supports all UnifiedQA variants, e.g. "allenai/unifiedqa-v2-t5-large-1251000"
-vqa_model = PromptCap_VQA(promptcap_model="
+vqa_model = PromptCap_VQA(promptcap_model="tifa-benchmark/promptcap-coco-vqa", qa_model="allenai/unifiedqa-t5-base")
 
 if torch.cuda.is_available():
     vqa_model.cuda()
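The captioning checkpoint is steered entirely by the instruction string. As a minimal sketch of how such an instruction can be composed (the `build_prompt` helper is hypothetical, not part of the promptcap API; the template mirrors the question-guided phrasing of the examples above, including the README's generic-caption question):

```python
# Hypothetical helper (not part of the promptcap package): compose the
# natural-language instruction that steers PromptCap's caption.
def build_prompt(question=None):
    if question is None:
        # Per the README, generic captioning uses this fixed question.
        question = "what does the image describe?"
    return f"please describe this image according to the given question: {question}"

print(build_prompt("what is the boy putting on?"))
# -> please describe this image according to the given question: what is the boy putting on?
```

The resulting string would be passed to the model alongside an image; the exact call signature is whatever the installed promptcap version exposes.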
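PromptCap_VQA chains the two models: the captioner first turns the image into a question-aware caption, and that caption then serves as reading-comprehension context for the UnifiedQA model. A rough sketch of the glue step (the function name and the exact UnifiedQA separator are assumptions, not promptcap internals):

```python
# Hypothetical glue (not promptcap internals): format a question plus a
# PromptCap caption as a single UnifiedQA-style input string.
def compose_qa_input(question, caption):
    # UnifiedQA-style inputs are lowercased "question \n context" strings;
    # the literal "\n" separator here is an assumption.
    return f"{question} \\n {caption}".lower()

qa_input = compose_qa_input(
    "What is the boy putting on?",
    "A boy is putting on a blue jacket.",  # e.g. a PromptCap caption
)
print(qa_input)
# -> what is the boy putting on? \n a boy is putting on a blue jacket.
```

This two-stage design is why any UnifiedQA variant can be dropped in via the `qa_model` argument: the QA model only ever sees text.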