Commit e3cee23 (1 parent: 9b1bf94): Update README.md
README.md CHANGED:

```diff
@@ -1,8 +1,10 @@
 
 # Visual semantic with BERT-CNN
 
+To take advantage of the overlap between the visual context and the caption, and to extract global information from each visual, we use BERT as an embedding layer followed by a shallow CNN with a tri-gram kernel (Kim, 2014).
+
 This model can be used to assign an object-to-caption relatedness score, which is valuable for
-(1) diverse caption re-ranking, and (2) generating soft labels for caption filtering when scraping captions from the internet.
+(1) diverse caption re-ranking, and (2) generating soft labels for caption filtering when scraping captions from the internet.
 
 The model is trained with a strict filter: a 0.4 similarity-distance threshold between the object and its related caption.
```
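The scoring head the README describes can be sketched in plain NumPy. This is a minimal, untrained illustration, not the repository's implementation: random vectors stand in for the pretrained BERT embeddings, the convolution and output weights are random rather than learned, and the interpretation of the 0.4 filter as "distance = 1 − score" is an assumption made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 32   # stand-in for BERT's hidden size (768 in BERT-base)
FILTERS = 16  # number of CNN feature maps
KERNEL = 3    # tri-gram kernel width, as in Kim (2014)-style text CNNs

# Hypothetical frozen weights; in the real model these would be learned.
W_conv = rng.normal(0.0, 0.1, size=(FILTERS, KERNEL * HIDDEN))
b_conv = np.zeros(FILTERS)
w_out = rng.normal(0.0, 0.1, size=FILTERS)

def embed(tokens):
    """Stand-in for the BERT embedding layer: a fixed random vector
    per token. The real model would run a pretrained BERT encoder."""
    return np.stack([
        np.random.default_rng(abs(hash(t)) % (2**32)).normal(size=HIDDEN)
        for t in tokens
    ])

def relatedness_score(obj, caption):
    """Score an (object, caption) pair with a tri-gram CNN head."""
    tokens = [obj] + caption.split()
    emb = embed(tokens)  # (seq_len, HIDDEN)
    # Slide a width-3 window over the token sequence (tri-gram conv).
    windows = np.stack([
        emb[i:i + KERNEL].ravel()
        for i in range(len(tokens) - KERNEL + 1)
    ])
    feat = np.maximum(windows @ W_conv.T + b_conv, 0.0)  # ReLU
    pooled = feat.max(axis=0)                            # max-over-time pooling
    logit = pooled @ w_out
    return 1.0 / (1.0 + np.exp(-logit))                  # sigmoid -> [0, 1]

score = relatedness_score("dog", "a brown dog runs on the beach")
# Assumed reading of the strict filter: keep the pair only if the
# similarity distance (here taken as 1 - score) is at most 0.4.
keep = (1.0 - score) <= 0.4
```

A trained version of this head is what would produce the soft labels and re-ranking scores mentioned above; with random weights the score is meaningless beyond illustrating the data flow.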