tanaos-topic-classification-v1: A small but performant topic classification model
Looking for a custom Topic Classification model, or any other task-specific Small Language Model fine-tuned to your specific needs? We will do it for you! https://tanaos.com/#try-it-out
This model was created by Tanaos with the Artifex Python library.
This is a topic classification model based on FacebookAI/roberta-base and fine-tuned on a synthetic dataset to classify text into one of 15 topic categories:
| Topic | Description |
|---|---|
| politics | elections, policies, scandals, ideology. |
| health | physical health, mental health, fitness, diets, medical advice. |
| technology | gadgets, software, AI, cybersecurity. |
| entertainment | movies, TV shows, music, celebrities, streaming platforms. |
| money_finance | investing, budgeting, crypto, real estate. |
| relationships_dating | romance, breakups, marriage, family drama. |
| education_learning | schools, universities, self-study, online courses. |
| work_careers | job hunting, workplace culture, remote work, career advice. |
| science | research, space, climate, biology, physics, chemistry and the scientific method. |
| society_culture | identity, inequality, norms, language, and society. |
| gaming | video games, esports, hardware, mods, and gaming culture. |
| lifestyle_hobbies | travel, food, fashion, DIY, productivity systems. |
| sports | teams, athletes, events, scores, and sports culture. |
| automotive | cars, motorcycles, reviews, maintenance, and industry news. |
| other | miscellaneous topics not covered by the other categories. |
How to Use
Use this model for free via the Tanaos API in 3 simple steps:
- Sign up for a free account at https://platform.tanaos.com/
- Create a free API Key from the API Keys section
- Replace `<YOUR_API_KEY>` in the code below with your API Key and use this snippet:
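import requests
# Reuse a single HTTP session for all requests
session = requests.Session()
# Send the text to classify; the API key is passed in the X-API-Key header
tc_out = session.post(
    "https://slm.tanaos.com/models/topic-classification",
    headers={
        "X-API-Key": "<YOUR_API_KEY>",
    },
    json={
        "text": "What do you think about the latest AI advancements?"
    }
)
# The predictions are returned under the "data" key of the JSON response
print(tc_out.json()["data"])
# >>> [{'label': 'technology', 'score': 0.9910}]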
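For use inside an application, the request above can be wrapped in a small helper that fails loudly on HTTP errors and returns only the best prediction. The following is a minimal sketch, not an official client: the function names `classify` and `top_prediction` and the 10-second timeout are illustrative assumptions, and the response is assumed to keep the shape shown in the sample output (a list of `label`/`score` entries under `data`).

```python
from typing import Any, Dict, List

API_URL = "https://slm.tanaos.com/models/topic-classification"


def top_prediction(data: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Return the entry with the highest score from the response's data list."""
    return max(data, key=lambda item: item["score"])


def classify(session, text: str, api_key: str, timeout: float = 10.0) -> Dict[str, Any]:
    """Classify one text using an existing requests.Session-like object.

    The timeout value is an assumption, not a documented requirement.
    """
    response = session.post(
        API_URL,
        headers={"X-API-Key": api_key},
        json={"text": text},
        timeout=timeout,
    )
    # Raise an exception on 4xx/5xx responses instead of parsing an error body
    response.raise_for_status()
    return top_prediction(response.json()["data"])
```

With `requests` installed, a call would look like `classify(requests.Session(), "What do you think about the latest AI advancements?", "<YOUR_API_KEY>")`. Passing the session in as an argument keeps the helper easy to test with a stand-in object.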
Model Description
- Base model: `FacebookAI/roberta-base`
- Task: Text classification (topic classification)
- Languages: English
- Fine-tuning data: A synthetic, custom dataset of 10,000 utterances, each belonging to one of 15 different topic categories.
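Because the model is a fine-tuned `roberta-base` text classifier, it may also be possible to run it locally with the Hugging Face `transformers` library. The sketch below assumes that the weights are published on the Hub under the repository id `tanaos/tanaos-topic-classification-v1` and load with the standard `text-classification` pipeline; neither point is confirmed here. The helper names `load_classifier` and `classify_texts` are illustrative.

```python
from typing import Any, Callable, Dict, List

# Assumption: the Hub repository id of this model
MODEL_ID = "tanaos/tanaos-topic-classification-v1"


def load_classifier() -> Callable:
    """Build a text-classification pipeline (downloads the weights on first use)."""
    # Imported lazily so the rest of this module works without transformers
    from transformers import pipeline

    return pipeline("text-classification", model=MODEL_ID)


def classify_texts(texts: List[str], classifier: Callable) -> List[Dict[str, Any]]:
    """Return one {'text', 'label', 'score'} record per input text."""
    predictions = classifier(texts)
    return [
        {"text": text, "label": pred["label"], "score": pred["score"]}
        for text, pred in zip(texts, predictions)
    ]
```

A typical call would be `classify_texts(["How do I change my car's oil?"], load_classifier())`.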
Training Details
This model was trained using the Artifex Python library
pip install artifex
by providing the following instructions and generating 10,000 synthetic training samples:
from artifex import Artifex
topic_classification = Artifex().topic_classification
# Each class name maps to a short description used to generate synthetic training samples
topic_classification.train(
    domain="general",
    classes={
        "politics": "elections, policies, scandals, ideology.",
        "health": "physical health, mental health, fitness, diets, medical advice.",
        "technology": "gadgets, software, AI, cybersecurity.",
        "entertainment": "movies, TV shows, music, celebrities, streaming platforms.",
        "money_finance": "investing, budgeting, crypto, real estate.",
        "relationships_dating": "romance, breakups, marriage, family drama.",
        "education_learning": "schools, universities, self-study, online courses.",
        "work_careers": "job hunting, workplace culture, remote work, career advice.",
        "science": "research, space, climate, biology, physics, chemistry and the scientific method.",
        "society_culture": "identity, inequality, norms, language, and society.",
        "gaming": "video games, esports, hardware, mods, and gaming culture.",
        "lifestyle_hobbies": "travel, food, fashion, DIY, productivity systems.",
        "sports": "teams, athletes, events, scores, and sports culture.",
        "automotive": "cars, motorcycles, reviews, maintenance, and industry news.",
        "other": "miscellaneous topics not covered by the other categories."
    },
    num_samples=10000
)
Intended Uses
This model is intended to:
- Classify conversations, reviews, articles, or any text into one of the predefined topic categories.
- Be used in applications such as chatbots, content categorization, and sentiment analysis.
- Serve as a lightweight alternative for topic classification tasks.
Not intended for:
- Use cases requiring extremely high accuracy or domain-specific knowledge without further fine-tuning.
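Where misclassifications are costly, one option is to act only on confident predictions and route the rest to the catch-all `other` category (or to human review). The following is a minimal sketch under stated assumptions: the helper name `label_or_fallback` and the `0.7` threshold are illustrative and should be tuned on your own data, and the prediction is assumed to be a `label`/`score` entry as in the sample API output.

```python
from typing import Any, Dict


def label_or_fallback(
    prediction: Dict[str, Any],
    min_score: float = 0.7,  # assumption: an arbitrary starting threshold
    fallback: str = "other",
) -> str:
    """Return the predicted label, or the fallback label when the score is too low."""
    if prediction["score"] >= min_score:
        return prediction["label"]
    return fallback


print(label_or_fallback({"label": "technology", "score": 0.991}))  # technology
print(label_or_fallback({"label": "health", "score": 0.41}))       # other
```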