Files changed (3)
  1. README.md +9 -77
  2. config.json +2 -1
  3. model.safetensors +1 -1
README.md CHANGED
@@ -23,7 +23,6 @@ language:
  - tl
  - nl
  - gsw
- - sw
  library_name: transformers
  license: cc-by-nc-4.0
  pipeline_tag: text-classification
@@ -40,57 +39,26 @@ tags:
  - multilingual
  - 🇪🇺
  - region:eu
- - synthetic
- datasets:
- - tabularisai/swahili_sentiment_dataset
- ---
-
- > [!TIP]
- > 🚀 These models are now available through the Tabularis API.
- > Fast multilingual sentiment + emotion classification in 23 languages with structured outputs and simple pricing.
- >
- > ✅ Free 10K credits/month
- > 📚 Docs + API key: https://tabularis.ai/sentiment-analysis/
  
+ ---
  
  
- # 🚀 Multilingual Sentiment Classification Model (23 Languages)
+ # 🚀 distilbert-based Multilingual Sentiment Classification Model
  
  <!-- TRY IT HERE: `coming soon`
  -->
- <!-- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/Discord%20button.png" width="200"/>](https://discord.gg/sznxwdqBXj)
- -->
- [![Join Discord](https://img.shields.io/badge/Discord-Join%20community-5865F2?logo=discord&logoColor=white)](https://discord.gg/sznxwdqBXj)
-
- # NEWS!
- - 2025/8: Major model update +1 new language: **Swahili**! Also, general improvements accross all languages.
-
- - 2025/8: Free DEMO API for our model! Please see below!
-
- - 2025/7: We’ve just released ModernFinBERT, a model we’ve been working on for a while. It’s built on the ModernBERT architecture and trained on a mix of real and synthetic data, with LLM-based label correction applied to public datasets to fix human annotation errors.
- It’s performing well across a range of benchmarks — in some cases improving accuracy by up to 48% over existing models like FinBERT.
- You can check it out here on Hugging Face:
- 👉 https://huggingface.co/tabularisai/ModernFinBERT
+ [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/Discord%20button.png" width="200"/>](https://discord.gg/sznxwdqBXj)
  
  
- ## 🔌 Hosted DEMO API
+ # NEWS!
  
- We provide a hosted inference API:
-
- **Example request body:**
-
- ```json
- curl -X POST https://api.tabularis.ai/ \
- -H "Content-Type: application/json" \
- -d '{"text":"I love the design","return_all_scores":false}'
-
- ```
+ - 2024/12: We are excited to introduce a multilingual sentiment model! Now you can analyze sentiment across multiple languages, enhancing your global reach.
  
  ## Model Details
  - `Model Name:` tabularisai/multilingual-sentiment-analysis
  - `Base Model:` distilbert/distilbert-base-multilingual-cased
  - `Task:` Text Classification (Sentiment Analysis)
- - `Languages:` Supports English plus Chinese (中文), Spanish (Español), Hindi (हिन्दी), Arabic (العربية), Bengali (বাংলা), Portuguese (Português), Russian (Русский), Japanese (日本語), German (Deutsch), Malay (Bahasa Melayu), Telugu (తెలుగు), Vietnamese (Tiếng Việt), Korean (한국어), French (Français), Turkish (Türkçe), Italian (Italiano), Polish (Polski), Ukrainian (Українська), Tagalog, Dutch (Nederlands), Swiss German (Schweizerdeutsch), and Swahili.
+ - `Languages:` Supports English plus Chinese (中文), Spanish (Español), Hindi (हिन्दी), Arabic (العربية), Bengali (বাংলা), Portuguese (Português), Russian (Русский), Japanese (日本語), German (Deutsch), Malay (Bahasa Melayu), Telugu (తెలుగు), Vietnamese (Tiếng Việt), Korean (한국어), French (Français), Turkish (Türkçe), Italian (Italiano), Polish (Polski), Ukrainian (Українська), Tagalog, Dutch (Nederlands), Swiss German (Schweizerdeutsch).
  - `Number of Classes:` 5 (*Very Negative, Negative, Neutral, Positive, Very Positive*)
  - `Usage:`
  - Social media analysis
@@ -101,8 +69,6 @@ curl -X POST https://api.tabularis.ai/ \
  - Customer service optimization
  - Competitive intelligence
  
-
-
  ## Model Description
  
  This model is a fine-tuned version of `distilbert/distilbert-base-multilingual-cased` for multilingual sentiment analysis. It leverages synthetic data from multiple sources to achieve robust performance across different languages and cultural contexts.
@@ -212,46 +178,12 @@ for text, sentiment in zip(texts, predict_sentiment(texts)):
  Synthetic data reduces bias, but validation in real-world scenarios is advised.
  
  ## Citation
- ```bib
- @misc{tabularisai2025multilingualsentiment,
- author = {Vadim Borisov and Samuel Gyamfi and Richard H. Schreiber},
- title = {Multilingual Sentiment Analysis},
- year = {2025},
- doi = {10.57967/hf/5968},
- url = {https://huggingface.co/tabularisai/multilingual-sentiment-analysis},
- publisher = {Hugging Face},
- note = {Revision 69afb83}
- }
+ ```
+ Will be included.
  ```
  
  ## Contact
  
  For inquiries, data, private APIs, better models, contact info@tabularis.ai
  
- tabularis.ai
-
-
- <table align="center">
- <tr>
- <td align="center">
- <a href="https://www.linkedin.com/company/tabularis-ai/">
- <img src="https://cdn.jsdelivr.net/gh/simple-icons/simple-icons/icons/linkedin.svg" alt="LinkedIn" width="30" height="30">
- </a>
- </td>
- <td align="center">
- <a href="https://x.com/tabularis_ai">
- <img src="https://cdn.jsdelivr.net/gh/simple-icons/simple-icons/icons/x.svg" alt="X" width="30" height="30">
- </a>
- </td>
- <td align="center">
- <a href="https://github.com/tabularis-ai">
- <img src="https://cdn.jsdelivr.net/gh/simple-icons/simple-icons/icons/github.svg" alt="GitHub" width="30" height="30">
- </a>
- </td>
- <td align="center">
- <a href="https://tabularis.ai">
- <img src="https://cdn.jsdelivr.net/gh/simple-icons/simple-icons/icons/internetarchive.svg" alt="Website" width="30" height="30">
- </a>
- </td>
- </tr>
- </table>
+ tabularis.ai
config.json CHANGED
@@ -1,4 +1,5 @@
  {
+ "_name_or_path": "results/checkpoint-1400_best",
  "activation": "gelu",
  "architectures": [
  "DistilBertForSequenceClassification"
@@ -34,6 +35,6 @@
  "sinusoidal_pos_embds": false,
  "tie_weights_": true,
  "torch_dtype": "float32",
- "transformers_version": "4.55.0",
+ "transformers_version": "4.46.3",
  "vocab_size": 119547
  }
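
The config.json change adds `_name_or_path` and moves `transformers_version` from 4.55.0 to 4.46.3. A minimal sketch of how such a key-level comparison could be reproduced, assuming both revisions have already been parsed into dictionaries. The helper name `diff_config` is hypothetical, and the two sample dictionaries are abbreviated stand-ins that contain only fields visible in the diff, not the full files.

```python
import json


def diff_config(old: dict, new: dict) -> dict:
    """Return {key: (old_value, new_value)} for every top-level key that differs.

    A key missing on one side is reported as None on that side, so a key
    whose real value is null is indistinguishable from a missing key here.
    """
    changed = {}
    for key in sorted(set(old) | set(new)):
        if old.get(key) != new.get(key):
            changed[key] = (old.get(key), new.get(key))
    return changed


# Abbreviated stand-ins for the two revisions (assumption: only the
# fields shown in the diff are included).
old_cfg = json.loads(
    '{"activation": "gelu", "torch_dtype": "float32",'
    ' "transformers_version": "4.55.0", "vocab_size": 119547}'
)
new_cfg = json.loads(
    '{"_name_or_path": "results/checkpoint-1400_best", "activation": "gelu",'
    ' "torch_dtype": "float32", "transformers_version": "4.46.3",'
    ' "vocab_size": 119547}'
)

changes = diff_config(old_cfg, new_cfg)
for key, (before, after) in changes.items():
    print(f"{key}: {before!r} -> {after!r}")
```

The comparison is shallow on purpose: nested values such as `architectures` are compared as whole objects, which is enough to flag that a field changed.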
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:3ab3cecb8605da0a240e5b4e18d969704d44e27c6ea48533ef6693d31dbb926a
+ oid sha256:3bb33a58e6056036c2b396c6971d3c7ebe916c7f2d7fb5bb46aa319ed3288ff8
  size 541326604
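
For model.safetensors the diff only shows the Git LFS pointer: the `oid` changes while `size` stays at 541326604 bytes, so the weights were replaced by a file of identical length. A rough sketch of how a downloaded file could be checked against such a pointer, assuming the pointer text is available as a string. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any Git LFS or Hugging Face tooling.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file ("key value" per line) into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def matches_pointer(pointer: dict, payload: bytes) -> bool:
    """Check payload bytes against the pointer's size and sha256 oid."""
    algo, _, digest = pointer["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported hash algorithm: {algo}")
    return (
        len(payload) == int(pointer["size"])
        and hashlib.sha256(payload).hexdigest() == digest
    )


# Pointer for the new revision, copied from the diff.
new_pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:3bb33a58e6056036c2b396c6971d3c7ebe916c7f2d7fb5bb46aa319ed3288ff8\n"
    "size 541326604\n"
)
print(new_pointer["oid"], new_pointer["size"])
```

For a file of this size, reading everything into memory is wasteful; in practice the hash would be fed in chunks via `hashlib.sha256().update(...)` while streaming the file from disk.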