tanaos-topic-classification-v1: A small but performant topic classification model
This model was created by Tanaos with the Artifex Python library.
This is a topic classification model based on FacebookAI/roberta-base and fine-tuned on a synthetic dataset to classify text into one of 15 topic categories:
| Topic | Description |
|---|---|
| politics | elections, policies, scandals, ideology. |
| health | physical health, mental health, fitness, diets, medical advice. |
| technology | gadgets, software, AI, cybersecurity. |
| entertainment | movies, TV shows, music, celebrities, streaming platforms. |
| money_finance | investing, budgeting, crypto, real estate. |
| relationships_dating | romance, breakups, marriage, family drama. |
| education_learning | schools, universities, self-study, online courses. |
| work_careers | job hunting, workplace culture, remote work, career advice. |
| science | research, space, climate, biology, physics, chemistry and the scientific method. |
| society_culture | identity, inequality, norms, language, and society. |
| gaming | video games, esports, hardware, mods, and gaming culture. |
| lifestyle_hobbies | travel, food, fashion, DIY, productivity systems. |
| sports | teams, athletes, events, scores, and sports culture. |
| automotive | cars, motorcycles, reviews, maintenance, and industry news. |
| other | miscellaneous topics not covered by the other categories. |
How to Use
Via the Artifex library (pip install artifex)
from artifex import Artifex
topic_classification = Artifex().topic_classification
print(topic_classification("What do you think about the latest AI advancements?"))
# >>> [{'label': 'technology', 'score': 0.9910}]
Via the Transformers library
from transformers import pipeline
clf = pipeline("text-classification", model="tanaos/tanaos-topic-classification-v1")
print(clf("What do you think about the latest AI advancements?"))
# >>> [{'label': 'technology', 'score': 0.9910}]
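Both calls above return a list of dictionaries with a `label` and a `score`. The following is a minimal sketch of one way to post-process that output, assuming you want low-confidence predictions routed to the `other` category. The helper name `pick_topic`, the 0.5 threshold, and the scores in the `uncertain` example are illustrative assumptions, not part of the model or either library.

```python
def pick_topic(predictions, threshold=0.5, fallback="other"):
    """Return the top label, or `fallback` when the best score is below `threshold`.

    `predictions` is a list of {"label": str, "score": float} dicts,
    the same shape as the classifier output shown above.
    """
    if not predictions:
        return fallback
    best = max(predictions, key=lambda p: p["score"])
    return best["label"] if best["score"] >= threshold else fallback


# Hand-written predictions for illustration only (not real model output).
confident = [{"label": "technology", "score": 0.9910}]
uncertain = [
    {"label": "science", "score": 0.41},
    {"label": "technology", "score": 0.38},
]

print(pick_topic(confident))  # technology
print(pick_topic(uncertain))  # other
```

With the Transformers example above, this would be called as `pick_topic(clf(text))`. A suitable threshold depends on your data and should be tuned on a labeled sample.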
Model Description
- Base model: FacebookAI/roberta-base
- Task: Text classification (topic classification)
- Languages: English
- Fine-tuning data: A synthetic, custom dataset of 10,000 utterances, each belonging to one of 15 different topic categories.
Training Details
This model was trained using the Artifex Python library (pip install artifex) by providing the following instructions and generating 10,000 synthetic training samples:
from artifex import Artifex
topic_classification = Artifex().topic_classification
topic_classification.train(
domain="general",
classes={
"politics": "elections, policies, scandals, ideology",
"health": "physical health, mental health, fitness, diets, medical advice.",
"technology": "gadgets, software, AI, cybersecurity.",
"entertainment": "movies, TV shows, music, celebrities, streaming platforms.",
"money_finance": "investing, budgeting, crypto, real estate.",
"relationships_dating": "romance, breakups, marriage, family drama.",
"education_learning": "schools, universities, self-study, online courses.",
"work_careers": "job hunting, workplace culture, remote work, career advice.",
"science": "research, space, climate, biology, physics, chemistry and the scientific method.",
"society_culture": "identity, inequality, norms, language, and society.",
"gaming": "video games, esports, hardware, mods, and gaming culture.",
"lifestyle_hobbies": "travel, food, fashion, DIY, productivity systems.",
"sports": "teams, athletes, events, scores, and sports culture.",
"automotive": "cars, motorcycles, reviews, maintenance, and industry news.",
"other": "miscellaneous topics not covered by the other categories."
},
num_samples=10000
)
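Before starting a training run like the one above, it can help to sanity-check the class map. The sketch below is one hypothetical way to do that in plain Python; it does not use Artifex. The helpers `check_classes` and `even_split` are illustrative names, the snake_case rule simply mirrors the labels used above, and the even split across classes is an assumption, since this card does not state how the 10,000 samples are distributed.

```python
import re

# Abbreviated copy of the class map from the training call above;
# only three of the 15 entries are repeated to keep the sketch short.
classes = {
    "politics": "elections, policies, scandals, ideology",
    "technology": "gadgets, software, AI, cybersecurity.",
    "other": "miscellaneous topics not covered by the other categories.",
}


def check_classes(classes):
    """Return a list of problems found in a label -> description map."""
    problems = []
    for label, description in classes.items():
        # Hypothetical convention: lowercase words joined by underscores.
        if not re.fullmatch(r"[a-z]+(_[a-z]+)*", label):
            problems.append(f"label {label!r} is not lowercase snake_case")
        if not description.strip():
            problems.append(f"label {label!r} has an empty description")
    return problems


def even_split(num_samples, num_classes):
    """Samples per class and leftover count, ASSUMING an even split."""
    return divmod(num_samples, num_classes)


print(check_classes(classes))  # []
print(even_split(10000, 15))   # (666, 10)
```

Under the even-split assumption, each of the 15 classes would receive about 666 samples, with 10 left over.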
Intended Uses
This model is intended to:
- Classify conversations, reviews, articles, or any text into one of the predefined topic categories.
- Be used in applications such as chatbots, content categorization, and sentiment analysis.
- Serve as a lightweight alternative for topic classification tasks.
Not intended for:
- Use cases requiring extremely high accuracy or domain-specific knowledge without further fine-tuning.
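As an illustration of the content categorization use case, here is a minimal sketch that groups a batch of texts by their top predicted topic. The function `group_by_topic` and the stand-in classifier `fake_classify` are hypothetical; the stand-in exists only so the example runs without downloading the model.

```python
from collections import defaultdict


def group_by_topic(texts, classify):
    """Group texts by their top predicted label.

    `classify` is any callable that maps one string to a list of
    {"label": str, "score": float} dicts, like the classifier output
    shown in the usage section.
    """
    groups = defaultdict(list)
    for text in texts:
        top = max(classify(text), key=lambda p: p["score"])
        groups[top["label"]].append(text)
    return dict(groups)


def fake_classify(text):
    """Stand-in for the real model: a keyword rule, for illustration only."""
    label = "sports" if "match" in text else "technology"
    return [{"label": label, "score": 1.0}]


texts = ["Great match last night", "New phone review", "The match was postponed"]
groups = group_by_topic(texts, fake_classify)
print({label: len(items) for label, items in groups.items()})
# {'sports': 2, 'technology': 1}
```

To use the real model, pass the `clf` pipeline from the Transformers example in place of `fake_classify`.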