Update README.md
README.md
CHANGED
@@ -1,3 +1,28 @@
-
-
-
+# FuseCap: Leveraging Large Language Models to Fuse Visual Data into Enriched Image Captions
+
+A framework designed to generate semantically rich image captions.
+
+## Resources
+
+- 💻 **Project Page**: For more details, visit the official [project page](https://rotsteinnoam.github.io/FuseCap/).
+
+- 📝 **Read the Paper**: You can find the paper [here](https://arxiv.org/abs/2305.17718).
+
+- 🚀 **Demo**: Try out the [demo](https://huggingface.co/spaces/noamrot/FuseCap) of our BLIP-based model trained using FuseCap, hosted on Hugging Face Spaces.
+
+## Upcoming Updates
+
+The official codebase and trained models for this project will be released soon.
+
+## BibTeX
+
+``` Citation
+@misc{rotstein2023fusecap,
+  title={FuseCap: Leveraging Large Language Models to Fuse Visual Data into Enriched Image Captions},
+  author={Noam Rotstein and David Bensaid and Shaked Brody and Roy Ganz and Ron Kimmel},
+  year={2023},
+  eprint={2305.17718},
+  archivePrefix={arXiv},
+  primaryClass={cs.CV}
+}
+```