tanaos-topic-classification-v1: A small but performant topic classification model
This model was created by Tanaos with the Artifex Python library.
This is a topic classification model based on FacebookAI/roberta-base and fine-tuned on a synthetic dataset to classify text into one of 15 topic categories:
| Topic | Description |
|---|---|
| politics | elections, policies, scandals, ideology. |
| health | physical health, mental health, fitness, diets, medical advice. |
| technology | gadgets, software, AI, cybersecurity. |
| entertainment | movies, TV shows, music, celebrities, streaming platforms. |
| money_finance | investing, budgeting, crypto, real estate. |
| relationships_dating | romance, breakups, marriage, family drama. |
| education_learning | schools, universities, self-study, online courses. |
| work_careers | job hunting, workplace culture, remote work, career advice. |
| science | research, space, climate, biology, physics, chemistry and the scientific method. |
| society_culture | identity, inequality, norms, language, and society. |
| gaming | video games, esports, hardware, mods, and gaming culture. |
| lifestyle_hobbies | travel, food, fashion, DIY, productivity systems. |
| sports | teams, athletes, events, scores, and sports culture. |
| automotive | cars, motorcycles, reviews, maintenance, and industry news. |
| other | miscellaneous topics not covered by the other categories. |
How to Use
Use this model through the Artifex library:
Install Artifex with:
pip install artifex
Then use the model with:
from artifex import Artifex
topic_classification = Artifex().topic_classification()
topic = topic_classification("What do you think about the latest AI advancements?")
print(topic)
# >>> [{'label': 'technology', 'score': 0.9910}]
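The output shown above is a list of dictionaries with `label` and `score` keys. As a minimal sketch, assuming that format holds for every input, low-confidence predictions can be routed to the `other` category. The helper `classify_with_fallback`, the `threshold` default of 0.5, and `stub_classifier` are illustrative names and values, not part of Artifex; the stub stands in for the real model so the example runs on its own.

```python
from typing import Any, Callable, Dict, List

Prediction = Dict[str, Any]


def classify_with_fallback(
    classifier: Callable[[str], List[Prediction]],
    text: str,
    threshold: float = 0.5,  # illustrative cutoff, tune on your own data
    fallback_label: str = "other",
) -> str:
    """Return the top label, or fallback_label when confidence is low."""
    predictions = classifier(text)
    if not predictions:
        return fallback_label
    # Pick the highest-scoring prediction.
    best = max(predictions, key=lambda p: float(p["score"]))
    if float(best["score"]) < threshold:
        return fallback_label
    return str(best["label"])


def stub_classifier(text: str) -> List[Prediction]:
    """Hypothetical stand-in for the real model (assumed output format)."""
    if "AI" in text:
        return [{"label": "technology", "score": 0.99}]
    return [{"label": "science", "score": 0.30}]


print(classify_with_fallback(stub_classifier, "What about the latest AI advancements?"))
print(classify_with_fallback(stub_classifier, "Hmm, not sure."))
```

To use this with the real model, pass the `topic_classification` object from the snippet above in place of `stub_classifier`, provided it is callable on a single string as shown there.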
Model Description
- Base model: FacebookAI/roberta-base
- Task: Text classification (topic classification)
- Languages: English
- Fine-tuning data: A synthetic, custom dataset of 10,000 utterances, each belonging to one of 15 different topic categories.
Training Details
This model was trained using the Artifex Python library
pip install artifex
by providing the following instructions and generating 10,000 synthetic training samples:
from artifex import Artifex
topic_classification = Artifex().topic_classification()
topic_classification.train(
domain="general",
classes={
"politics": "elections, policies, scandals, ideology",
"health": "physical health, mental health, fitness, diets, medical advice.",
"technology": "gadgets, software, AI, cybersecurity.",
"entertainment": "movies, TV shows, music, celebrities, streaming platforms.",
"money_finance": "investing, budgeting, crypto, real estate.",
"relationships_dating": "romance, breakups, marriage, family drama.",
"education_learning": "schools, universities, self-study, online courses.",
"work_careers": "job hunting, workplace culture, remote work, career advice.",
"science": "research, space, climate, biology, physics, chemistry and the scientific method.",
"society_culture": "identity, inequality, norms, language, and society.",
"gaming": "video games, esports, hardware, mods, and gaming culture.",
"lifestyle_hobbies": "travel, food, fashion, DIY, productivity systems.",
"sports": "teams, athletes, events, scores, and sports culture.",
"automotive": "cars, motorcycles, reviews, maintenance, and industry news.",
"other": "miscellaneous topics not covered by the other categories."
},
num_samples=10000
)
Intended Uses
This model is intended to:
- Classify conversations, reviews, articles, or any text into one of the predefined topic categories.
- Be used in applications such as chatbots, content categorization, and sentiment analysis.
- Serve as a lightweight alternative for topic classification tasks.
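For the content categorization use case, one possible approach is to bucket a collection of texts by their predicted topic. The sketch below assumes the classifier returns a list of `{'label', 'score'}` dictionaries per input, as in the usage example; `group_by_topic` and `keyword_stub` are hypothetical names, and the stub replaces the real model so the code runs without Artifex.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, List

Prediction = Dict[str, Any]


def group_by_topic(
    classifier: Callable[[str], List[Prediction]],
    texts: Iterable[str],
) -> Dict[str, List[str]]:
    """Group texts under the label of their highest-scoring prediction."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for text in texts:
        predictions = classifier(text)
        top = max(predictions, key=lambda p: float(p["score"]))
        groups[str(top["label"])].append(text)
    return dict(groups)


def keyword_stub(text: str) -> List[Prediction]:
    """Hypothetical stand-in for the real model, based on simple keywords."""
    lowered = text.lower()
    if "election" in lowered:
        return [{"label": "politics", "score": 0.95}]
    if "gpu" in lowered:
        return [{"label": "technology", "score": 0.90}]
    return [{"label": "other", "score": 0.60}]


documents = [
    "The election results are in.",
    "My new GPU arrived.",
    "Election turnout was high.",
    "Hello there.",
]
for topic, items in group_by_topic(keyword_stub, documents).items():
    print(topic, len(items))
```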
Not intended for:
- Use cases requiring extremely high accuracy or domain-specific knowledge without further fine-tuning.
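Before relying on the model in a specific domain, it may help to measure accuracy on a small hand-labelled sample and decide whether further fine-tuning is needed. The following is a sketch under the same assumed output format (`label` and `score` keys); `per_class_accuracy`, `match_stub`, and the sample texts are illustrative and not taken from the model's training or evaluation data.

```python
from collections import Counter
from typing import Any, Callable, Dict, List, Tuple

Prediction = Dict[str, Any]


def per_class_accuracy(
    classifier: Callable[[str], List[Prediction]],
    labelled: List[Tuple[str, str]],
) -> Dict[str, float]:
    """Compute the fraction of correct top predictions for each expected label."""
    correct: Counter = Counter()
    total: Counter = Counter()
    for text, expected in labelled:
        predictions = classifier(text)
        predicted = max(predictions, key=lambda p: float(p["score"]))["label"]
        total[expected] += 1
        if predicted == expected:
            correct[expected] += 1
    return {label: correct[label] / total[label] for label in total}


def match_stub(text: str) -> List[Prediction]:
    """Hypothetical stand-in for the real model: 'match' means sports."""
    label = "sports" if "match" in text else "gaming"
    return [{"label": label, "score": 0.9}]


samples = [
    ("The match ended 2-1.", "sports"),
    ("Our team won the league.", "sports"),
    ("New mods for the game.", "gaming"),
]
print(per_class_accuracy(match_stub, samples))
```

Classes with low accuracy on such a sample are candidates for additional domain-specific training data.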