metadata
license: mit
language:
- en
base_model:
- google/vit-base-patch16-224
- FacebookAI/roberta-base
tags:
- comics
- composition
- comic
- comic-analysis
- page
- fusion
The model code and documentation repository is at https://github.com/RichardScottOZ/comic-analysis
This model uses transformer-based multimodal fusion of image and text to produce embeddings, which can be used to query comics by similarity or by text.
For more detail, see the repo above.
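
To make the idea of querying by embedding similarity concrete, here is a minimal sketch in plain Python. It is not the repository's code: the `fuse` helper (normalise each modality, concatenate, renormalise), the `top_k` helper, the page ids, and the toy vectors are all hypothetical stand-ins for illustration. In practice the image and text vectors would come from the model, and the actual fusion method may well differ.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(image_vec, text_vec):
    """Hypothetical late fusion: normalise each modality, concatenate, renormalise.

    This is an assumption for illustration, not the model's actual fusion.
    """
    return l2_normalize(l2_normalize(image_vec) + l2_normalize(text_vec))


def cosine(a, b):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


def top_k(query, index, k=2):
    """Return the k (page_id, score) pairs most similar to the query."""
    scored = [(cosine(query, emb), page_id) for page_id, emb in index.items()]
    scored.sort(reverse=True)
    return [(page_id, score) for score, page_id in scored[:k]]


# Toy index: made-up 3-d "image" and 2-d "text" vectors for three pages.
index = {
    "page_a": fuse([1.0, 0.0, 0.0], [0.0, 1.0]),
    "page_b": fuse([0.9, 0.1, 0.0], [0.1, 0.9]),
    "page_c": fuse([0.0, 0.0, 1.0], [1.0, 0.0]),
}

# Query with an embedding built the same way as the indexed pages.
query = fuse([1.0, 0.0, 0.0], [0.0, 1.0])
results = top_k(query, index, k=2)
print(results)
```

Normalising before taking the dot product means the score is a cosine similarity, so pages are ranked by direction in embedding space rather than by vector magnitude.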