| |
|
| | --- |
| | pipeline_tag: text-classification |
| | library_name: turftopic |
| | tags: |
| | - turftopic |
| | - topic-modelling |
| | --- |
| | |
| | # kardosdrur/testing_s3 |
| | |
| | This repository contains a topic model trained with the [Turftopic](https://github.com/x-tabdeveloping/turftopic) Python library. |
| | |
| | To load and use the model, run the following code: |
| | |
| | ```python |
| | from turftopic import load_model |
| |
|
| | model = load_model("kardosdrur/testing_s3") |
| | model.print_topics() |
| | ``` |
| | |
| | ## Model Structure |
| | |
| | The model is structured as follows: |
| | |
| | ``` |
| | ClusteringTopicModel(clustering=KMeans(n_clusters=20), |
| | dimensionality_reduction=PCA(n_components=5), |
| | feature_importance='c-tf-idf', |
| | vectorizer=CountVectorizer(min_df=10, |
| | stop_words='english')) |
| | ``` |
| | |
| | ## Topics |
| | The topics discovered by the model are the following: |
| |
|
| | | Topic ID | Highest-Ranking Words | |
| | | - | - | |
| | | 0 | ax, max, g9v, b8f, jpeg, pl, a86, db, 1d9, file | |
| | | 1 | drive, scsi, price, card, sale, 00, shipping, ram, pc, offer | |
| | | 2 | pathetic, path, patient, patience, paths, pathology, patrick, patent, patently, patriot | |
| | | 3 | key, encryption, government, clipper, chip, keys, law, use, nsa, escrow | |
| | | 4 | people, right, don, think, just, government, like, say, does, rights | |
| | | 5 | game, team, year, 25, play, games, players, 10, 55, season | |
| | | 6 | dos, windows, image, file, edu, ftp, version, files, available, program | |
| | | 7 | god, jesus, bible, people, christ, believe, christians, christian, faith, say | |
| | | 8 | mr, president, people, fbi, gun, think, did, don, batf, know | |
| | | 9 | space, use, new, launch, used, like, don, know, just, 00 | |
| | | 10 | god, jews, people, church, does, did, christian, greek, just, israel | |
| | | 11 | car, just, like, don, people, think, money, insurance, make, time | |
| | | 12 | software, windows, thanks, know, version, does, ftp, available, xfree86, pc | |
| | | 13 | ax, edu, information, pub, space, ftp, data, mail, file, entry | |
| | | 14 | hockey, game, games, team, season, nhl, la, league, don, pts | |
| | | 15 | armenian, armenians, turkish, people, said, israel, jews, genocide, israeli, armenia | |
| | | 16 | 00, car, new, 50, price, bike, good, like, 1st, 10 | |
| | | 17 | like, just, time, problem, don, use, know, vitamin, good, think | |
| | | 18 | drive, scsi, card, disk, windows, controller, drives, use, bus, ide | |
| | | 19 | ax, max, edu, com, b8f, ah, 145, a86, pl, air | |
| |
|
| | ## Package versions |
| |
|
| | The model in this repo was trained using the following package versions: |
| |
|
| | | Package | Version | |
| | | - | - | |
| | | scikit-learn | 1.3.2 | |
| | | sentence-transformers | 3.2.0 | |
| | | turftopic | 0.6.0 | |
| | | joblib | 1.2.0 | |
| |
|
| | We recommend that you install the same or compatible versions of these packages locally before trying to load the model. |
| |
|
| |
|
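As a rough way to check a local environment against the package versions listed above, the sketch below compares installed versions with the ones from the table using only the standard library. The `EXPECTED` dictionary and the `compare_versions` helper are illustrative names, not part of Turftopic, and the comparison is an exact string match, which is stricter than the "compatible versions" the recommendation allows.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the "Package versions" table in this model card.
EXPECTED = {
    "scikit-learn": "1.3.2",
    "sentence-transformers": "3.2.0",
    "turftopic": "0.6.0",
    "joblib": "1.2.0",
}


def compare_versions(expected):
    """Return {package: (expected, installed or None)} for every mismatch.

    Hypothetical helper: an exact string comparison, so a newer but
    compatible release is still reported as a mismatch.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, installed) in compare_versions(EXPECTED).items():
        found = installed or "not installed"
        print(f"{package}: expected {wanted}, found {found}")
```

A reported mismatch does not necessarily mean the model will fail to load; it only flags where the local environment differs from the one used for training.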