Text Classification
Transformers
Safetensors
distilbert
sentiment-analysis
sentiment
synthetic data
multi-class
social-media-analysis
customer-feedback
product-reviews
brand-monitoring
multilingual
🇪🇺
region:eu
Synthetic
text-embeddings-inference
Instructions to use tabularisai/multilingual-sentiment-analysis with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use tabularisai/multilingual-sentiment-analysis with Transformers:

    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-classification", model="tabularisai/multilingual-sentiment-analysis")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained("tabularisai/multilingual-sentiment-analysis")
    model = AutoModelForSequenceClassification.from_pretrained("tabularisai/multilingual-sentiment-analysis")

- Inference
- Notebooks
- Google Colab
- Kaggle
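As a complement to the Transformers loading instructions above, here is a minimal sketch of turning the model's five raw logits into a label and a probability. It uses only the standard library so it runs without downloading the model. The label order is an assumption taken from the order in which the model card lists the classes; the authoritative mapping is `model.config.id2label`. The helper names `softmax` and `top_label` and the sample logits are hypothetical.

```python
import math

# Assumed label order: the model card lists the five classes in this order,
# but the authoritative mapping lives in `model.config.id2label`.
LABELS = ["Very Negative", "Negative", "Neutral", "Positive", "Very Positive"]


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, labels=LABELS):
    """Return the (label, probability) pair with the highest probability."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical logits for a single input text.
label, score = top_label([-1.2, -0.3, 0.1, 2.4, 0.9])
print(label, round(score, 3))
```

With a real model, the logits would come from `model(**tokenizer(text, return_tensors="pt")).logits[0].tolist()`.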
exploring
#1
by HermannS11 - opened
- README.md +9 -77
- config.json +2 -1
- model.safetensors +1 -1
README.md
CHANGED
|
@@ -23,7 +23,6 @@ language:
|
|
| 23 |
- tl
|
| 24 |
- nl
|
| 25 |
- gsw
|
| 26 |
-
- sw
|
| 27 |
library_name: transformers
|
| 28 |
license: cc-by-nc-4.0
|
| 29 |
pipeline_tag: text-classification
|
|
@@ -40,57 +39,26 @@ tags:
|
|
| 40 |
- multilingual
|
| 41 |
- 🇪🇺
|
| 42 |
- region:eu
|
| 43 |
-
- synthetic
|
| 44 |
-
datasets:
|
| 45 |
-
- tabularisai/swahili_sentiment_dataset
|
| 46 |
-
---
|
| 47 |
-
|
| 48 |
-
> [!TIP]
|
| 49 |
-
> 🚀 These models are now available through the Tabularis API.
|
| 50 |
-
> Fast multilingual sentiment + emotion classification in 23 languages with structured outputs and simple pricing.
|
| 51 |
-
>
|
| 52 |
-
> ✅ Free 10K credits/month
|
| 53 |
-
> 📚 Docs + API key: https://tabularis.ai/sentiment-analysis/
|
| 54 |
|
|
|
|
| 55 |
|
| 56 |
|
| 57 |
-
# 🚀 Multilingual Sentiment Classification Model
|
| 58 |
|
| 59 |
<!-- TRY IT HERE: `coming soon`
|
| 60 |
-->
|
| 61 |
-
|
| 62 |
-
-->
|
| 63 |
-
[](https://discord.gg/sznxwdqBXj)
|
| 64 |
-
|
| 65 |
-
# NEWS!
|
| 66 |
-
- 2025/8: Major model update +1 new language: **Swahili**! Also, general improvements across all languages.
|
| 67 |
-
|
| 68 |
-
- 2025/8: Free DEMO API for our model! Please see below!
|
| 69 |
-
|
| 70 |
-
- 2025/7: We’ve just released ModernFinBERT, a model we’ve been working on for a while. It’s built on the ModernBERT architecture and trained on a mix of real and synthetic data, with LLM-based label correction applied to public datasets to fix human annotation errors.
|
| 71 |
-
It’s performing well across a range of benchmarks — in some cases improving accuracy by up to 48% over existing models like FinBERT.
|
| 72 |
-
You can check it out here on Hugging Face:
|
| 73 |
-
👉 https://huggingface.co/tabularisai/ModernFinBERT
|
| 74 |
|
| 75 |
|
| 76 |
-
#
|
| 77 |
|
| 78 |
-
We
|
| 79 |
-
|
| 80 |
-
**Example request body:**
|
| 81 |
-
|
| 82 |
-
```bash
|
| 83 |
-
curl -X POST https://api.tabularis.ai/ \
|
| 84 |
-
-H "Content-Type: application/json" \
|
| 85 |
-
-d '{"text":"I love the design","return_all_scores":false}'
|
| 86 |
-
|
| 87 |
-
```
|
| 88 |
|
| 89 |
## Model Details
|
| 90 |
- `Model Name:` tabularisai/multilingual-sentiment-analysis
|
| 91 |
- `Base Model:` distilbert/distilbert-base-multilingual-cased
|
| 92 |
- `Task:` Text Classification (Sentiment Analysis)
|
| 93 |
-
- `Languages:` Supports English plus Chinese (中文), Spanish (Español), Hindi (हिन्दी), Arabic (العربية), Bengali (বাংলা), Portuguese (Português), Russian (Русский), Japanese (日本語), German (Deutsch), Malay (Bahasa Melayu), Telugu (తెలుగు), Vietnamese (Tiếng Việt), Korean (한국어), French (Français), Turkish (Türkçe), Italian (Italiano), Polish (Polski), Ukrainian (Українська), Tagalog, Dutch (Nederlands), Swiss German (Schweizerdeutsch)
|
| 94 |
- `Number of Classes:` 5 (*Very Negative, Negative, Neutral, Positive, Very Positive*)
|
| 95 |
- `Usage:`
|
| 96 |
- Social media analysis
|
|
@@ -101,8 +69,6 @@ curl -X POST https://api.tabularis.ai/ \
|
|
| 101 |
- Customer service optimization
|
| 102 |
- Competitive intelligence
|
| 103 |
|
| 104 |
-
|
| 105 |
-
|
| 106 |
## Model Description
|
| 107 |
|
| 108 |
This model is a fine-tuned version of `distilbert/distilbert-base-multilingual-cased` for multilingual sentiment analysis. It leverages synthetic data from multiple sources to achieve robust performance across different languages and cultural contexts.
|
|
@@ -212,46 +178,12 @@ for text, sentiment in zip(texts, predict_sentiment(texts)):
|
|
| 212 |
Synthetic data reduces bias, but validation in real-world scenarios is advised.
|
| 213 |
|
| 214 |
## Citation
|
| 215 |
-
```
|
| 216 |
-
|
| 217 |
-
author = {Vadim Borisov and Samuel Gyamfi and Richard H. Schreiber},
|
| 218 |
-
title = {Multilingual Sentiment Analysis},
|
| 219 |
-
year = {2025},
|
| 220 |
-
doi = {10.57967/hf/5968},
|
| 221 |
-
url = {https://huggingface.co/tabularisai/multilingual-sentiment-analysis},
|
| 222 |
-
publisher = {Hugging Face},
|
| 223 |
-
note = {Revision 69afb83}
|
| 224 |
-
}
|
| 225 |
```
|
| 226 |
|
| 227 |
## Contact
|
| 228 |
|
| 229 |
For inquiries, data, private APIs, better models, contact info@tabularis.ai
|
| 230 |
|
| 231 |
-
tabularis.ai
|
| 232 |
-
|
| 233 |
-
|
| 234 |
-
<table align="center">
|
| 235 |
-
<tr>
|
| 236 |
-
<td align="center">
|
| 237 |
-
<a href="https://www.linkedin.com/company/tabularis-ai/">
|
| 238 |
-
<img src="https://cdn.jsdelivr.net/gh/simple-icons/simple-icons/icons/linkedin.svg" alt="LinkedIn" width="30" height="30">
|
| 239 |
-
</a>
|
| 240 |
-
</td>
|
| 241 |
-
<td align="center">
|
| 242 |
-
<a href="https://x.com/tabularis_ai">
|
| 243 |
-
<img src="https://cdn.jsdelivr.net/gh/simple-icons/simple-icons/icons/x.svg" alt="X" width="30" height="30">
|
| 244 |
-
</a>
|
| 245 |
-
</td>
|
| 246 |
-
<td align="center">
|
| 247 |
-
<a href="https://github.com/tabularis-ai">
|
| 248 |
-
<img src="https://cdn.jsdelivr.net/gh/simple-icons/simple-icons/icons/github.svg" alt="GitHub" width="30" height="30">
|
| 249 |
-
</a>
|
| 250 |
-
</td>
|
| 251 |
-
<td align="center">
|
| 252 |
-
<a href="https://tabularis.ai">
|
| 253 |
-
<img src="https://cdn.jsdelivr.net/gh/simple-icons/simple-icons/icons/internetarchive.svg" alt="Website" width="30" height="30">
|
| 254 |
-
</a>
|
| 255 |
-
</td>
|
| 256 |
-
</tr>
|
| 257 |
-
</table>
|
|
|
|
| 23 |
- tl
|
| 24 |
- nl
|
| 25 |
- gsw
|
|
|
|
| 26 |
library_name: transformers
|
| 27 |
license: cc-by-nc-4.0
|
| 28 |
pipeline_tag: text-classification
|
|
|
|
| 39 |
- multilingual
|
| 40 |
- 🇪🇺
|
| 41 |
- region:eu
|
| 42 |
|
| 43 |
+
---
|
| 44 |
|
| 45 |
|
| 46 |
+
# 🚀 distilbert-based Multilingual Sentiment Classification Model
|
| 47 |
|
| 48 |
<!-- TRY IT HERE: `coming soon`
|
| 49 |
-->
|
| 50 |
+
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/Discord%20button.png" width="200"/>](https://discord.gg/sznxwdqBXj)
|
| 51 |
|
| 52 |
|
| 53 |
+
# NEWS!
|
| 54 |
|
| 55 |
+
- 2024/12: We are excited to introduce a multilingual sentiment model! Now you can analyze sentiment across multiple languages, enhancing your global reach.
|
| 56 |
|
| 57 |
## Model Details
|
| 58 |
- `Model Name:` tabularisai/multilingual-sentiment-analysis
|
| 59 |
- `Base Model:` distilbert/distilbert-base-multilingual-cased
|
| 60 |
- `Task:` Text Classification (Sentiment Analysis)
|
| 61 |
+
- `Languages:` Supports English plus Chinese (中文), Spanish (Español), Hindi (हिन्दी), Arabic (العربية), Bengali (বাংলা), Portuguese (Português), Russian (Русский), Japanese (日本語), German (Deutsch), Malay (Bahasa Melayu), Telugu (తెలుగు), Vietnamese (Tiếng Việt), Korean (한국어), French (Français), Turkish (Türkçe), Italian (Italiano), Polish (Polski), Ukrainian (Українська), Tagalog, Dutch (Nederlands), Swiss German (Schweizerdeutsch).
|
| 62 |
- `Number of Classes:` 5 (*Very Negative, Negative, Neutral, Positive, Very Positive*)
|
| 63 |
- `Usage:`
|
| 64 |
- Social media analysis
|
|
|
|
| 69 |
- Customer service optimization
|
| 70 |
- Competitive intelligence
|
| 71 |
|
|
|
|
|
|
|
| 72 |
## Model Description
|
| 73 |
|
| 74 |
This model is a fine-tuned version of `distilbert/distilbert-base-multilingual-cased` for multilingual sentiment analysis. It leverages synthetic data from multiple sources to achieve robust performance across different languages and cultural contexts.
|
|
|
|
| 178 |
Synthetic data reduces bias, but validation in real-world scenarios is advised.
|
| 179 |
|
| 180 |
## Citation
|
| 181 |
+
```
|
| 182 |
+
Will be included.
|
| 183 |
```
|
| 184 |
|
| 185 |
## Contact
|
| 186 |
|
| 187 |
For inquiries, data, private APIs, better models, contact info@tabularis.ai
|
| 188 |
|
| 189 |
+
tabularis.ai
|
config.json
CHANGED
|
@@ -1,4 +1,5 @@
|
|
| 1 |
{
|
|
|
|
| 2 |
"activation": "gelu",
|
| 3 |
"architectures": [
|
| 4 |
"DistilBertForSequenceClassification"
|
|
@@ -34,6 +35,6 @@
|
|
| 34 |
"sinusoidal_pos_embds": false,
|
| 35 |
"tie_weights_": true,
|
| 36 |
"torch_dtype": "float32",
|
| 37 |
-
"transformers_version": "4.
|
| 38 |
"vocab_size": 119547
|
| 39 |
}
|
|
|
|
| 1 |
{
|
| 2 |
+
"_name_or_path": "results/checkpoint-1400_best",
|
| 3 |
"activation": "gelu",
|
| 4 |
"architectures": [
|
| 5 |
"DistilBertForSequenceClassification"
|
|
|
|
| 35 |
"sinusoidal_pos_embds": false,
|
| 36 |
"tie_weights_": true,
|
| 37 |
"torch_dtype": "float32",
|
| 38 |
+
"transformers_version": "4.46.3",
|
| 39 |
"vocab_size": 119547
|
| 40 |
}
|
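The config.json change above adds one key and alters one value. A minimal sketch of how such a change could be inspected programmatically follows. The helper `diff_configs` is hypothetical, and the two dictionaries contain only a few of the keys visible in the diff; the old `transformers_version` value is truncated in the source and is kept exactly as it appears.

```python
def diff_configs(old, new):
    """Return (added, removed, changed) between two config dictionaries."""
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    changed = {
        k: (old[k], new[k])
        for k in old.keys() & new.keys()
        if old[k] != new[k]
    }
    return added, removed, changed


# Subset of the keys shown in the diff; not the full config.
old_config = {
    "activation": "gelu",
    "torch_dtype": "float32",
    "transformers_version": "4.",  # truncated in the source, left as-is
    "vocab_size": 119547,
}
new_config = {
    "_name_or_path": "results/checkpoint-1400_best",
    "activation": "gelu",
    "torch_dtype": "float32",
    "transformers_version": "4.46.3",
    "vocab_size": 119547,
}

added, removed, changed = diff_configs(old_config, new_config)
print("added:", added)
print("removed:", removed)
print("changed:", changed)
```

In practice the two dictionaries would be loaded with `json.load` from the config.json of each revision.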
model.safetensors
CHANGED
|
@@ -1,3 +1,3 @@
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:
|
| 3 |
size 541326604
|
|
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:3bb33a58e6056036c2b396c6971d3c7ebe916c7f2d7fb5bb46aa319ed3288ff8
|
| 3 |
size 541326604
|
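The model.safetensors entry above is a Git LFS pointer: a small text file recording the SHA-256 digest and byte size of the real weights file. A minimal sketch of parsing such a pointer and checking a downloaded file against it follows. The function names are hypothetical, and the check is demonstrated on a tiny in-memory payload rather than the actual 541326604-byte file.

```python
import hashlib
import io


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def stream_matches_pointer(fileobj, pointer, chunk_size=1 << 20):
    """Hash a binary stream in chunks and compare digest and size."""
    if pointer["algo"] != "sha256":
        raise ValueError("only sha256 pointers are handled in this sketch")
    hasher = hashlib.sha256()
    total = 0
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        hasher.update(chunk)
        total += len(chunk)
    return total == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# The new pointer exactly as shown in the diff.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:3bb33a58e6056036c2b396c6971d3c7ebe916c7f2d7fb5bb46aa319ed3288ff8\n"
    "size 541326604\n"
)
print(pointer["size"], pointer["digest"][:12])

# Toy payload standing in for a downloaded file.
payload = b"hello"
toy_pointer = {
    "version": pointer["version"],
    "algo": "sha256",
    "digest": hashlib.sha256(payload).hexdigest(),
    "size": len(payload),
}
print(stream_matches_pointer(io.BytesIO(payload), toy_pointer))
```

For the real weights, pass `open("model.safetensors", "rb")` as the stream; reading in chunks avoids loading the whole file into memory.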