MaartenGr
/

BERTopic_Multimodal

BERTopic

MaartenGr committed on May 30, 2023

Commit

af53722

1 Parent(s): f1a31f1

Update readme

README.md +66 -0

@@ -10,6 +10,12 @@ library_name: bertopic
 This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
 BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.
 ## Usage
 To use this model, please install BERTopic:
@@ -27,6 +33,12 @@ topic_model = BERTopic.load("MaartenGr/BERTopic_Multimodal")
 topic_model.get_topic_info()
 ```
 ## Topic overview
 * Number of topics: 29
@@ -69,6 +81,60 @@ topic_model.get_topic_info()
 </details>
 ## Training hyperparameters
 * calculate_probabilities: False

 This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
 BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.
+This model was trained on 8000 images from Flickr **without** the captions. This demonstrates how BERTopic can be used for topic modeling using images as input only.
+A few examples of generated topics:
+!["multimodal.png"](multimodal.png)
 ## Usage
 To use this model, please install BERTopic:
 topic_model.get_topic_info()
 ```
+You can view all information about a topic as follows:
+```python
+topic_model.get_topic(topic_id, full=True)
+```
 ## Topic overview
 * Number of topics: 29
 </details>
+## Training Procedure
+The data was retrieved as follows:
+```python
+import os
+import glob
+import zipfile
+import numpy as np
+import pandas as pd
+from tqdm import tqdm
+from sentence_transformers import util
+# Flickr 8k images
+img_folder = 'photos/'
+caps_folder = 'captions/'
+if not os.path.exists(img_folder) or len(os.listdir(img_folder)) == 0:
+    os.makedirs(img_folder, exist_ok=True)
+    if not os.path.exists('Flickr8k_Dataset.zip'):   # Download the dataset if it does not exist
+        util.http_get('https://github.com/jbrownlee/Datasets/releases/download/Flickr8k/Flickr8k_Dataset.zip', 'Flickr8k_Dataset.zip')
+        util.http_get('https://github.com/jbrownlee/Datasets/releases/download/Flickr8k/Flickr8k_text.zip', 'Flickr8k_text.zip')
+    for folder, file in [(img_folder, 'Flickr8k_Dataset.zip'), (caps_folder, 'Flickr8k_text.zip')]:
+        with zipfile.ZipFile(file, 'r') as zf:
+            for member in tqdm(zf.infolist(), desc='Extracting'):
+                zf.extract(member, folder)
+images = list(glob.glob('photos/Flicker8k_Dataset/*.jpg'))
+```
+Then, to perform topic modeling on multimodal data with BERTopic:
+```python
+from bertopic import BERTopic
+from bertopic.backend import MultiModalBackend
+from bertopic.representation import VisualRepresentation, KeyBERTInspired
+# Image embedding model
+embedding_model = MultiModalBackend('clip-ViT-B-32', batch_size=32)
+# Image to text representation model
+representation_model = {
+    "Visual_Aspect": VisualRepresentation(image_to_text_model="nlpconnect/vit-gpt2-image-captioning", image_squares=True),
+    "KeyBERT": KeyBERTInspired()
+}
+# Train our model with images only
+topic_model = BERTopic(representation_model=representation_model, verbose=True, embedding_model=embedding_model, min_topic_size=30)
+topics, probs = topic_model.fit_transform(documents=None, images=images)
+```
+The above demonstrates that the input consisted only of images. These images are clustered, and a small subset of representative images is extracted from each cluster. The representative images are captioned using `"nlpconnect/vit-gpt2-image-captioning"` to generate a small textual dataset over which we can run c-TF-IDF and the additional
+`KeyBERTInspired` representation model.
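To make the last step more concrete, here is a minimal sketch of how c-TF-IDF could score words once each cluster has a handful of captions. It is not BERTopic's implementation: the captions are invented for illustration, the helper names `c_tf_idf` and `top_terms` are hypothetical, and tokenization is a plain whitespace split. It assumes the commonly documented weighting, term frequency within a topic multiplied by `log(1 + A / f_t)`, where `A` is the average number of words per topic and `f_t` is the frequency of the term across all topics.

```python
import math
from collections import Counter


def c_tf_idf(captions_per_topic):
    """Score each term per topic with a simplified class-based TF-IDF.

    captions_per_topic: dict mapping topic id -> list of caption strings.
    Returns a dict mapping topic id -> {term: score}.
    """
    # Treat all captions of a topic as one bag of words
    term_counts = {
        topic: Counter(" ".join(captions).lower().split())
        for topic, captions in captions_per_topic.items()
    }

    # Frequency of each term across all topics (f_t)
    total_counts = Counter()
    for counts in term_counts.values():
        total_counts.update(counts)

    # Average number of words per topic (A)
    avg_words = sum(total_counts.values()) / len(term_counts)

    scores = {}
    for topic, counts in term_counts.items():
        n_words = sum(counts.values())
        scores[topic] = {
            # Length-normalized term frequency times the class-based IDF
            term: (freq / n_words) * math.log(1 + avg_words / total_counts[term])
            for term, freq in counts.items()
        }
    return scores


def top_terms(scores, topic, n=3):
    """Return the n highest-scoring terms of a topic (ties broken alphabetically)."""
    ranked = sorted(scores[topic].items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]


# Hypothetical captions standing in for the output of the captioning model
captions = {
    0: ["dog running on grass", "dog catching a ball"],
    1: ["man riding a bike", "man on a bike jumping"],
}
scores = c_tf_idf(captions)
print(top_terms(scores, 0))
print(top_terms(scores, 1))
```

Words that occur often inside one topic but rarely in the others receive the highest scores, which is why a few captions per cluster can be enough to produce a readable topic description.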
 ## Training hyperparameters
 * calculate_probabilities: False