Commit e3cee23 (1 parent: 9b1bf94): Update README.md
README.md CHANGED:

```diff
@@ -1,8 +1,10 @@
 
 # Visual semantic with BERT-CNN
 
+To take advantage of the overlap between the visual context and the caption, and to extract global information from each visual, we use BERT as an embedding layer followed by a shallow CNN with a tri-gram kernel (Kim, 2014).
+
 This model can be used to assign an object-to-caption relatedness score, which is valuable for
-(1) diverse caption re-ranking, and (2) generating soft labels for caption filtering when scraping captions from the internet.
+(1) diverse caption re-ranking, and (2) generating soft labels for caption filtering when scraping captions from the internet.
 
 The model is trained with a strict filter: a 0.4 similarity-distance threshold between the object and its related caption.
```
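The scoring head the README describes can be sketched in plain NumPy. This is a minimal, untrained illustration, not the repository's implementation: random vectors stand in for the pretrained BERT embeddings, the convolution and output weights are random rather than learned, and the interpretation of the 0.4 filter as "distance = 1 − score" is an assumption made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 32   # stand-in for BERT's hidden size (768 in BERT-base)
FILTERS = 16  # number of CNN feature maps
KERNEL = 3    # tri-gram kernel width, as in Kim (2014)-style text CNNs

# Hypothetical frozen weights; in the real model these would be learned.
W_conv = rng.normal(0.0, 0.1, size=(FILTERS, KERNEL * HIDDEN))
b_conv = np.zeros(FILTERS)
w_out = rng.normal(0.0, 0.1, size=FILTERS)

def embed(tokens):
    """Stand-in for the BERT embedding layer: a fixed random vector
    per token. The real model would run a pretrained BERT encoder."""
    return np.stack([
        np.random.default_rng(abs(hash(t)) % (2**32)).normal(size=HIDDEN)
        for t in tokens
    ])

def relatedness_score(obj, caption):
    """Score an (object, caption) pair with a tri-gram CNN head."""
    tokens = [obj] + caption.split()
    emb = embed(tokens)  # (seq_len, HIDDEN)
    # Slide a width-3 window over the token sequence (tri-gram conv).
    windows = np.stack([
        emb[i:i + KERNEL].ravel()
        for i in range(len(tokens) - KERNEL + 1)
    ])
    feat = np.maximum(windows @ W_conv.T + b_conv, 0.0)  # ReLU
    pooled = feat.max(axis=0)                            # max-over-time pooling
    logit = pooled @ w_out
    return 1.0 / (1.0 + np.exp(-logit))                  # sigmoid -> [0, 1]

score = relatedness_score("dog", "a brown dog runs on the beach")
# Assumed reading of the strict filter: keep the pair only if the
# similarity distance (here taken as 1 - score) is at most 0.4.
keep = (1.0 - score) <= 0.4
```

A trained version of this head is what would produce the soft labels and re-ranking scores mentioned above; with random weights the score is meaningless beyond illustrating the data flow.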