AyoubChLin
/

bertopic_cnn_news

CNN news articles


AyoubChLin committed on Apr 5, 2023

Commit

a20a891

·

1 Parent(s): b788199

Update README.md

Files changed (1)

README.md +58 -0


@@ -1,3 +1,61 @@
 ---
 license: apache-2.0
+datasets:
+- AyoubChLin/CNN_News_Articles_2011-2022
+language:
+- en
+tags:
+- topic modeling
+- BERT
+- CNN news articles
 ---
+# BERTopic Model for CNN News Articles
+This is a BERTopic model trained on CNN news articles. It uses the sentence-transformers model `all-MiniLM-L6-v2` to encode the documents and UMAP for dimensionality reduction.
+## Usage
+First, install the required packages:
+```console
+pip install sentence_transformers umap-learn bertopic
+```
+Then, load the model and encode your documents:
+```python
+from sentence_transformers import SentenceTransformer
+from umap import UMAP
+from bertopic import BERTopic
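+# Load the sentence transformer model used to embed the documents
+sentence_model = SentenceTransformer("all-MiniLM-L6-v2")
+# UMAP settings; a fixed random_state makes the reduction reproducible.
+# Note: this object is not passed to BERTopic.load below.
+umap_model = UMAP(n_neighbors=15, n_components=5, min_dist=0.0, metric='cosine', random_state=42)
+# Load the BERTopic model (replace the placeholder path with your own)
+my_model = BERTopic.load("from/path/model.bin")
+# Encode your documents (a list of strings; these two are placeholders)
+documents = ["my first document", "my second document"]
+document_embeddings = sentence_model.encode(documents)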
+```
+## Predict
+```python
+sentence = "my sentence"
+# Embed the sentence, then assign it to a topic using the precomputed embedding
+embeddings = sentence_model.encode([sentence])
+topics, _ = my_model.transform([sentence], embeddings)
+```
+For more information on how to use the BERTopic model, see the [BERTopic documentation](https://maartengr.github.io/BERTopic/index.html).
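
The README stops at the raw topic ids returned by `transform`. As a minimal sketch of one way to make those ids readable, the helper below maps each id to its top words. It is not part of the commit: `summarize_topics` and `fake_get_topic` are hypothetical names, and the words in the stand-in vocabulary are invented for illustration. It relies on BERTopic's convention that `-1` marks outlier documents and that `get_topic` returns a list of `(word, score)` pairs, or `False` for an unknown topic id.

```python
from typing import Callable, List, Sequence, Tuple, Union

# Shape of what BERTopic's get_topic returns: (word, score) pairs, or False
TopicWords = Union[List[Tuple[str, float]], bool]


def summarize_topics(
    topics: Sequence[int],
    get_topic: Callable[[int], TopicWords],
    top_n: int = 5,
) -> List[str]:
    """Turn predicted topic ids into short, readable labels."""
    labels = []
    for topic_id in topics:
        if topic_id == -1:
            # BERTopic assigns -1 to documents it treats as outliers
            labels.append("outlier")
            continue
        words = get_topic(topic_id)
        if not words:
            # get_topic returns False when the topic id does not exist
            labels.append(f"topic {topic_id} (unknown)")
            continue
        top_words = ", ".join(word for word, _ in words[:top_n])
        labels.append(f"topic {topic_id}: {top_words}")
    return labels


# Hypothetical stand-in for my_model.get_topic, with made-up words, so the
# sketch runs without the model file.
def fake_get_topic(topic_id: int) -> TopicWords:
    vocabulary = {0: [("election", 0.9), ("vote", 0.8)]}
    return vocabulary.get(topic_id, False)


print(summarize_topics([0, -1, 7], fake_get_topic))
```

With the loaded model from the README, the call would presumably be `summarize_topics(topics, my_model.get_topic)`, where `topics` is the first value returned by `my_model.transform`.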