Commit 51c4420 (verified) by ShanukaB
1 Parent(s): afb543f

Create README.md

Files changed (1)
  1. README.md +91 -0
README.md ADDED
@@ -0,0 +1,91 @@
+ ---
+ language: ta
+ license: apache-2.0
+ tags:
+ - tamil
+ - emotion-classification
+ - text-classification
+ - fine-tuned
+ - multilingual
+ base_model: jusgowiturs/autotrain-tamil_emotion_11_tamilbert-2710380899
+ pipeline_tag: text-classification
+ ---
+
+ # Tamil Text Emotion Recognition Model
+
+ A language model fine-tuned for **11-class emotion classification** of Tamil text.
+ Detects: Ambiguous, Anger, Anticipation, Disgust, Fear, Joy, Love, Neutral, Sadness, Surprise, Trust.
+ Achieves ~94.5% accuracy on the validation set after 6 epochs of fine-tuning.
+
+ ## Model Details
+
+ ### Model Description
+
+ - **Developed by:** Shanuka B Serasinghe
+ - **Shared by:** Shanuka B Serasinghe
+ - **Model type:** Text Classification (fine-tuned transformer for multi-class emotion detection)
+ - **Language(s) (NLP):** Tamil (தமிழ்)
+ - **License:** Apache-2.0
+ - **Finetuned from model:** jusgowiturs/autotrain-tamil_emotion_11_tamilbert-2710380899 (AutoTrain-generated Tamil-BERT style checkpoint)
+
+ ### Model Sources
+
+ - **Repository:** https://huggingface.co/ShanukaB/Tamil_Emotion_Recognition_Model
+
+
+ ## Uses
+
+ ### Direct Use
+
+ Run inference directly with the Hugging Face `pipeline` API to classify Tamil sentences or comments into one of the 11 emotions.
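When the top label alone is not enough, the scores for all 11 emotions can be inspected. The sketch below assumes a recent `transformers` release in which the text-classification pipeline accepts `top_k=None` and returns a score for every label. `rank_emotions` and `classify_all_scores` are hypothetical helper names, the repository ID is taken from the Model Sources section, and the exact label strings depend on the model's configuration.

```python
from typing import Dict, List

# Repository ID as listed under "Model Sources" (assumed to be the model to load).
MODEL_ID = "ShanukaB/Tamil_Emotion_Recognition_Model"


def rank_emotions(scores: List[Dict], keep: int = 3) -> List[str]:
    """Format the `keep` highest-scoring labels as 'Label: 0.123' strings."""
    ordered = sorted(scores, key=lambda item: item["score"], reverse=True)
    return [f"{item['label']}: {item['score']:.3f}" for item in ordered[:keep]]


def classify_all_scores(texts: List[str], model_id: str = MODEL_ID) -> List[List[str]]:
    """Run the classifier and return the ranked labels for each text.

    The model is downloaded on first use, so nothing runs at import time.
    """
    from transformers import pipeline  # lazy import; requires `transformers`

    classifier = pipeline("text-classification", model=model_id)
    # top_k=None asks the pipeline for the scores of all labels, not just the best one.
    results = classifier(texts, top_k=None)
    return [rank_emotions(per_text) for per_text in results]
```

Calling `classify_all_scores(["இது ரொம்ப அழகா இருக்கு!"])` would then return the three highest-scoring emotions for that sentence, which makes close calls between neighbouring classes easier to spot.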
+
+ ### Downstream Use
+
+ - Building emotion-aware Tamil chatbots
+ - Tamil social media sentiment & emotion monitoring
+ - Mental health & emotional wellbeing applications in Tamil
+ - Customer support systems with emotion detection
+ - Further research/fine-tuning in low-resource Tamil NLP
+
+ ### Out-of-Scope Use
+
+ - High-stakes automated decisions (e.g. mental health diagnosis, hiring, legal)
+ - Real-time safety-critical systems without human oversight
+ - Non-Tamil languages (performance will be very poor)
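One way to keep non-Tamil input away from the classifier is a simple script check before inference. The sketch below is a heuristic and not part of the model: it measures how many letters fall in the Tamil Unicode block (U+0B80 to U+0BFF). The function names and the 0.8 threshold are illustrative assumptions.

```python
def tamil_letter_ratio(text: str) -> float:
    """Return the share of letters that belong to the Tamil Unicode block.

    Characters in U+0B80..U+0BFF (letters, vowel signs, Tamil digits) count as
    Tamil; any other alphabetic character counts as non-Tamil. Emoji, ASCII
    digits, punctuation, and whitespace are ignored.
    """
    tamil = sum(1 for ch in text if "\u0b80" <= ch <= "\u0bff")
    other = sum(
        1 for ch in text if ch.isalpha() and not ("\u0b80" <= ch <= "\u0bff")
    )
    total = tamil + other
    return tamil / total if total else 0.0


def is_mostly_tamil(text: str, threshold: float = 0.8) -> bool:
    """Heuristic gate: True when at least `threshold` of the letters are Tamil."""
    return tamil_letter_ratio(text) >= threshold
```

Texts that fail `is_mostly_tamil` can be skipped or flagged instead of classified. A ratio well below 1.0 but above 0.0 may also hint at code-mixed input, which is worth validating on your own data before relying on it.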
+
+ ## Bias, Risks, and Limitations
+
+ - Performs best on short-to-medium informal/colloquial Tamil text (social media style)
+ - Heavy code-mixing (Tamil + English) reduces accuracy
+ - Sarcasm, irony, subtle emotions, strong dialects, or very formal/literary Tamil may be misclassified
+ - Potential biases from training data (e.g. over-representation of certain topics/styles in emotion datasets)
+ - Not robust to adversarial inputs or out-of-distribution text
+
+ ### Recommendations
+
+ - Always combine model predictions with human review in sensitive use cases
+ - Test thoroughly on your specific domain/dialect before deployment
+ - Report issues (especially dialect or code-mixed failures) to help improve future versions
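The human-review recommendation can be sketched as a confidence gate: predictions below a cut-off are set aside for a person to check. `split_for_review` is a hypothetical helper, and the 0.80 threshold is an illustrative assumption rather than a value from this model card. A high score is also not proof of correctness, so the cut-off should be tuned on labelled data from your own domain.

```python
from typing import Dict, List, Tuple

# Illustrative cut-off, not a value from the model card; tune it on your own data.
REVIEW_THRESHOLD = 0.80


def split_for_review(
    texts: List[str],
    predictions: List[Dict],
    threshold: float = REVIEW_THRESHOLD,
) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str, float]]]:
    """Separate confident predictions from those a human should check.

    `predictions` holds one {'label': ..., 'score': ...} dict per text, i.e.
    the top result the text-classification pipeline returns for each input.
    """
    accepted, needs_review = [], []
    for text, pred in zip(texts, predictions):
        if pred["score"] >= threshold:
            accepted.append((text, pred["label"]))
        else:
            # Keep the score so reviewers can see how uncertain the model was.
            needs_review.append((text, pred["label"], pred["score"]))
    return accepted, needs_review
```

With a loaded pipeline, `split_for_review(texts, classifier(texts))` would return the auto-accepted pairs and the review queue in one call.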
+
+ ## How to Get Started with the Model
+
+ ```python
+ from transformers import pipeline
+
+ # Repository ID taken from the "Model Sources" section of this card.
+ classifier = pipeline(
+     "text-classification",
+     model="ShanukaB/Tamil_Emotion_Recognition_Model",
+     tokenizer="ShanukaB/Tamil_Emotion_Recognition_Model",
+ )
+
+ texts = [
+     "இது ரொம்ப அழகா இருக்கு! 🥰🥰",
+     "என்னடா இது… மிகவும் கோபமா வருது",
+     "யாரும் இல்லாம தனிமையா ஃபீல் பண்றேன் 😔",
+     "அடேங்கப்பா! இது எப்படி சாத்தியமா? 😲",
+ ]
+
+ # Classify each sentence and print the top label with its confidence score.
+ for text in texts:
+     result = classifier(text)[0]
+     print(f"Text: {text}")
+     print(f"→ {result['label']} (confidence: {result['score']:.3f})\n")
+ ```