Files changed (3)
  1. README.md +7 -68
  2. config.json +2 -1
  3. model.safetensors +1 -1
README.md CHANGED
@@ -23,7 +23,6 @@ language:
23
  - tl
24
  - nl
25
  - gsw
26
- - sw
27
  library_name: transformers
28
  license: cc-by-nc-4.0
29
  pipeline_tag: text-classification
@@ -40,50 +39,26 @@ tags:
40
  - multilingual
41
  - 🇪🇺
42
  - region:eu
43
- - synthetic
44
- datasets:
45
- - tabularisai/swahili_sentiment_dataset
46
  ---
47
 
48
 
49
- # 🚀 Multilingual Sentiment Classification Model (23 Languages)
50
 
51
  <!-- TRY IT HERE: `coming soon`
52
  -->
53
  [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/Discord%20button.png" width="200"/>](https://discord.gg/sznxwdqBXj)
54
 
55
 
56
- # NEWS!
57
- - 2025/8: Major model update +1 new language: **Swahili**! Also, general improvements accross all languages.
58
-
59
- - 2025/8: Free DEMO API for our model! Please see below!
60
-
61
- - 2025/7: We’ve just released ModernFinBERT, a model we’ve been working on for a while. It’s built on the ModernBERT architecture and trained on a mix of real and synthetic data, with LLM-based label correction applied to public datasets to fix human annotation errors.
62
- It’s performing well across a range of benchmarks — in some cases improving accuracy by up to 48% over existing models like FinBERT.
63
- You can check it out here on Hugging Face:
64
- 👉 https://huggingface.co/tabularisai/ModernFinBERT
65
 
66
  - 2024/12: We are excited to introduce a multilingual sentiment model! Now you can analyze sentiment across multiple languages, enhancing your global reach.
67
 
68
-
69
- ## 🔌 Hosted DEMO API
70
-
71
- We provide a hosted inference API:
72
-
73
- **Example request body:**
74
-
75
- ```json
76
- curl -X POST https://api.tabularis.ai/ \
77
- -H "Content-Type: application/json" \
78
- -d '{"text":"I love the design","return_all_scores":false}'
79
-
80
- ```
81
-
82
  ## Model Details
83
  - `Model Name:` tabularisai/multilingual-sentiment-analysis
84
  - `Base Model:` distilbert/distilbert-base-multilingual-cased
85
  - `Task:` Text Classification (Sentiment Analysis)
86
- - `Languages:` Supports English plus Chinese (中文), Spanish (Español), Hindi (हिन्दी), Arabic (العربية), Bengali (বাংলা), Portuguese (Português), Russian (Русский), Japanese (日本語), German (Deutsch), Malay (Bahasa Melayu), Telugu (తెలుగు), Vietnamese (Tiếng Việt), Korean (한국어), French (Français), Turkish (Türkçe), Italian (Italiano), Polish (Polski), Ukrainian (Українська), Tagalog, Dutch (Nederlands), Swiss German (Schweizerdeutsch), and Swahili.
87
  - `Number of Classes:` 5 (*Very Negative, Negative, Neutral, Positive, Very Positive*)
88
  - `Usage:`
89
  - Social media analysis
@@ -94,9 +69,6 @@ curl -X POST https://api.tabularis.ai/ \
94
  - Customer service optimization
95
  - Competitive intelligence
96
 
97
- > If you wish to use this model for commercial purposes, please obtain a license by contacting: info@tabularis.ai
98
-
99
-
100
  ## Model Description
101
 
102
  This model is a fine-tuned version of `distilbert/distilbert-base-multilingual-cased` for multilingual sentiment analysis. It leverages synthetic data from multiple sources to achieve robust performance across different languages and cultural contexts.
@@ -206,45 +178,12 @@ for text, sentiment in zip(texts, predict_sentiment(texts)):
206
  Synthetic data reduces bias, but validation in real-world scenarios is advised.
207
 
208
  ## Citation
209
- ```bib
210
- @misc{tabularisai_2025,
211
- author = { tabularisai and Samuel Gyamfi and Vadim Borisov and Richard H. Schreiber },
212
- title = { multilingual-sentiment-analysis (Revision 69afb83) },
213
- year = 2025,
214
- url = { https://huggingface.co/tabularisai/multilingual-sentiment-analysis },
215
- doi = { 10.57967/hf/5968 },
216
- publisher = { Hugging Face }
217
- }
218
  ```
219
 
220
  ## Contact
221
 
222
  For inquiries, data, private APIs, better models, contact info@tabularis.ai
223
 
224
- tabularis.ai
225
-
226
-
227
- <table align="center">
228
- <tr>
229
- <td align="center">
230
- <a href="https://www.linkedin.com/company/tabularis-ai/">
231
- <img src="https://cdn.jsdelivr.net/gh/simple-icons/simple-icons/icons/linkedin.svg" alt="LinkedIn" width="30" height="30">
232
- </a>
233
- </td>
234
- <td align="center">
235
- <a href="https://x.com/tabularis_ai">
236
- <img src="https://cdn.jsdelivr.net/gh/simple-icons/simple-icons/icons/x.svg" alt="X" width="30" height="30">
237
- </a>
238
- </td>
239
- <td align="center">
240
- <a href="https://github.com/tabularis-ai">
241
- <img src="https://cdn.jsdelivr.net/gh/simple-icons/simple-icons/icons/github.svg" alt="GitHub" width="30" height="30">
242
- </a>
243
- </td>
244
- <td align="center">
245
- <a href="https://tabularis.ai">
246
- <img src="https://cdn.jsdelivr.net/gh/simple-icons/simple-icons/icons/internetarchive.svg" alt="Website" width="30" height="30">
247
- </a>
248
- </td>
249
- </tr>
250
- </table>
 
23
  - tl
24
  - nl
25
  - gsw
 
26
  library_name: transformers
27
  license: cc-by-nc-4.0
28
  pipeline_tag: text-classification
 
39
  - multilingual
40
  - 🇪🇺
41
  - region:eu
42
+
 
 
43
  ---
44
 
45
 
46
+ # 🚀 distilbert-based Multilingual Sentiment Classification Model
47
 
48
  <!-- TRY IT HERE: `coming soon`
49
  -->
50
  [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/Discord%20button.png" width="200"/>](https://discord.gg/sznxwdqBXj)
51
 
52
 
53
+ # NEWS!
54
 
55
  - 2024/12: We are excited to introduce a multilingual sentiment model! Now you can analyze sentiment across multiple languages, enhancing your global reach.
56
 
57
  ## Model Details
58
  - `Model Name:` tabularisai/multilingual-sentiment-analysis
59
  - `Base Model:` distilbert/distilbert-base-multilingual-cased
60
  - `Task:` Text Classification (Sentiment Analysis)
61
+ - `Languages:` Supports English plus Chinese (中文), Spanish (Español), Hindi (हिन्दी), Arabic (العربية), Bengali (বাংলা), Portuguese (Português), Russian (Русский), Japanese (日本語), German (Deutsch), Malay (Bahasa Melayu), Telugu (తెలుగు), Vietnamese (Tiếng Việt), Korean (한국어), French (Français), Turkish (Türkçe), Italian (Italiano), Polish (Polski), Ukrainian (Українська), Tagalog, Dutch (Nederlands), Swiss German (Schweizerdeutsch).
62
  - `Number of Classes:` 5 (*Very Negative, Negative, Neutral, Positive, Very Positive*)
63
  - `Usage:`
64
  - Social media analysis
 
69
  - Customer service optimization
70
  - Competitive intelligence
71
 
 
 
 
72
  ## Model Description
73
 
74
  This model is a fine-tuned version of `distilbert/distilbert-base-multilingual-cased` for multilingual sentiment analysis. It leverages synthetic data from multiple sources to achieve robust performance across different languages and cultural contexts.
 
178
  Synthetic data reduces bias, but validation in real-world scenarios is advised.
179
 
180
  ## Citation
181
+ ```
182
+ Will be included.
183
  ```
184
 
185
  ## Contact
186
 
187
  For inquiries, data, private APIs, better models, contact info@tabularis.ai
188
 
189
+ tabularis.ai
config.json CHANGED
@@ -1,4 +1,5 @@
1
  {
 
2
  "activation": "gelu",
3
  "architectures": [
4
  "DistilBertForSequenceClassification"
@@ -34,6 +35,6 @@
34
  "sinusoidal_pos_embds": false,
35
  "tie_weights_": true,
36
  "torch_dtype": "float32",
37
- "transformers_version": "4.55.0",
38
  "vocab_size": 119547
39
  }
 
1
  {
2
+ "_name_or_path": "results/checkpoint-1400_best",
3
  "activation": "gelu",
4
  "architectures": [
5
  "DistilBertForSequenceClassification"
 
35
  "sinusoidal_pos_embds": false,
36
  "tie_weights_": true,
37
  "torch_dtype": "float32",
38
+ "transformers_version": "4.46.3",
39
  "vocab_size": 119547
40
  }
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:3ab3cecb8605da0a240e5b4e18d969704d44e27c6ea48533ef6693d31dbb926a
3
  size 541326604
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3bb33a58e6056036c2b396c6971d3c7ebe916c7f2d7fb5bb46aa319ed3288ff8
3
  size 541326604
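
The `model.safetensors` change above only swaps the Git LFS pointer: the `oid` differs while `size` stays at 541326604 bytes. As a minimal sketch, a downloaded weights file could be checked against the new pointer roughly as follows. The helper names `parse_lfs_pointer` and `matches_pointer` and the local file path are hypothetical and not part of the repository; only the pointer text is taken from the diff.

```python
import hashlib
from pathlib import Path

# Pointer text for the new revision, copied from the diff above.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:3bb33a58e6056036c2b396c6971d3c7ebe916c7f2d7fb5bb46aa319ed3288ff8
size 541326604
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer into its hash algorithm, digest and size (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file at `path` has the size and digest recorded in the pointer."""
    file_path = Path(path)
    if file_path.stat().st_size != pointer["size"]:
        return False  # cheap check first, avoids hashing a file of the wrong size
    hasher = hashlib.new(pointer["algo"])
    with file_path.open("rb") as handle:
        # Read in 1 MiB chunks so the ~540 MB file is never fully held in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    pointer = parse_lfs_pointer(NEW_POINTER)
    local_file = Path("model.safetensors")  # assumed local download location
    if local_file.exists():
        print("matches new revision:", matches_pointer(local_file, pointer))
```

Because both revisions report the same size, the size check alone cannot tell them apart; the SHA-256 digest is what distinguishes the old weights from the new ones.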