---
license: mit
datasets:
- tatoeba
language:
- pt
---
## Introduction
BERTopic-Tatoeba-PT is a topic model built with [BERTopic](https://maartengr.github.io/BERTopic/index.html) (Grootendorst, 2022) using its default parameters. Its documents are [Tatoeba](https://tatoeba.org/en/) sentences in Portuguese with English translations. The model was developed as part of the Master's thesis "Learning What to Learn: Generating Language Lessons using BERT"; the thesis code and text are available on [GitHub](https://github.com/joaoDossena/MasterThesis).

## Usage
```python
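# Install the dependency first: `pip install bertopic` in a shell,
# or `!pip install bertopic` in a notebook cell.
from bertopic import BERTopic

# Load the saved model (the argument is the path to the model file)
topic_model = BERTopic.load("bertopic_portuguese_tatoeba_5k")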
```
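
Once the model is loaded, a natural next step is to look at the words that describe each topic. The sketch below is one possible way to do that, not part of the original thesis code: the helper names `top_words_per_topic` and `load_and_summarize` are hypothetical, and the sketch assumes the model exposes BERTopic's `get_topics()` method, which maps each topic ID to a list of `(word, score)` pairs.

```python
def top_words_per_topic(topic_model, n_words=5):
    """Return {topic_id: [word, ...]} with the highest-ranked words of each topic.

    Hypothetical helper. Topic -1 is skipped because BERTopic uses it
    for outlier documents that were not assigned to any topic.
    """
    summary = {}
    for topic_id, words in topic_model.get_topics().items():
        if topic_id == -1:
            continue
        # Each entry in `words` is a (word, score) pair; keep only the word.
        summary[topic_id] = [word for word, _score in words[:n_words]]
    return summary


def load_and_summarize(path="bertopic_portuguese_tatoeba_5k", n_words=5):
    """Load the saved model from `path` and print a few words per topic."""
    # Imported here so the helper above can be used without bertopic installed.
    from bertopic import BERTopic

    topic_model = BERTopic.load(path)
    for topic_id, words in sorted(top_words_per_topic(topic_model, n_words).items()):
        print(topic_id, ", ".join(words))
```

Calling `load_and_summarize()` with the model file in the working directory prints one line per topic. Keeping the summarizing logic separate from loading means it can also be applied to a model that is already in memory.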