---
title: Vietnamese Sentiment Analysis
emoji: 🎭
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
---

# 🎭 Vietnamese Sentiment Analysis

A Vietnamese sentiment analysis web interface built with Gradio and transformer models, optimized for Hugging Face Spaces deployment.

## 🚀 Features

- **🤖 Transformer-based Model**: Uses 5CD-AI/Vietnamese-Sentiment-visobert from Hugging Face Hub
- **🌐 Interactive Web Interface**: Real-time sentiment analysis via Gradio
- **⚡ Memory Efficient**: Built-in memory management and batch processing limits
- **📊 Visual Analysis**: Confidence scores with interactive charts
- **📝 Batch Processing**: Analyze multiple texts at once
- **🛡️ Memory Management**: Real-time memory monitoring and cleanup

## 🎯 Usage

### Single Text Analysis

1. Enter Vietnamese text in the input field
2. Click "Analyze Sentiment"
3. View the sentiment prediction with confidence scores
4. See probability distribution in the chart

### Batch Analysis

1. Switch to "Batch Analysis" tab
2. Enter multiple Vietnamese texts (one per line)
3. Click "Analyze All" to process all texts
4. View comprehensive batch summary with sentiment distribution

### Memory Management

- Monitor real-time memory usage
- Use "Memory Cleanup" button if needed
- Automatic cleanup after each prediction
- Maximum 10 texts per batch for efficiency

## 📊 Model Details

- **Model**: 5CD-AI/Vietnamese-Sentiment-visobert
- **Architecture**: Transformer-based (XLM-RoBERTa)
- **Language**: Vietnamese
- **Labels**: Negative, Neutral, Positive
- **Max Sequence Length**: 512 tokens
- **Device**: Automatic CUDA/CPU detection

## 💡 Example Usage

Try these example Vietnamese texts:

- "Giảng viên dạy rất hay và tâm huyết." (Positive)
- "Môn học này quá khó và nhàm chán." (Negative)
- "Lớp học ổn định, không có gì đặc biệt." (Neutral)

## 🛠️ Technical Features

### Memory Optimization

- Automatic GPU cache clearing
- Garbage collection management
- Memory usage monitoring
- Batch size limits
- Real-time memory tracking

### Performance

- ~100ms processing time per text
- Supports up to 512 token sequences
- Efficient batch processing
- Memory limit: 8GB (Hugging Face Spaces)

## 📋 Model Performance

The model provides:

- **Sentiment Classification**: Positive, Neutral, Negative
- **Confidence Scores**: Probability distribution across classes
- **Real-time Processing**: Fast inference on CPU/GPU
- **Batch Analysis**: Efficient processing of multiple texts

## 🔧 Deployment

This Space is configured for Hugging Face Spaces with:

- **SDK**: Gradio 4.44.0
- **Hardware**: CPU (with CUDA support if available)
- **Memory**: 8GB limit with optimization
- **Model Loading**: Direct from Hugging Face Hub

## 📄 Requirements

See `requirements.txt` for complete dependency list:

- torch>=2.0.0
- transformers>=4.21.0
- gradio>=4.44.0
- pandas, numpy, scikit-learn
- psutil for memory monitoring

## 🎯 Use Cases

- **Education**: Analyze student feedback
- **Customer Service**: Analyze customer reviews
- **Social Media**: Monitor sentiment in posts
- **Research**: Vietnamese text analysis
- **Business**: Customer sentiment tracking

## 🔍 Troubleshooting

### Memory Issues

- Use "Memory Cleanup" button
- Reduce batch size
- Refresh the page if needed

### Model Loading

- Model loads automatically from Hugging Face Hub
- No local training required
- Automatic fallback to CPU if GPU unavailable

### Performance Tips

- Clear, grammatically correct Vietnamese text works best
- Longer texts (20-200 words) provide better context
- Use batch processing for multiple texts

## 📝 Citation

If you use this model or Space, please cite the original model:

```bibtex
@InProceedings{8573337,
  author={Nguyen, Kiet Van and Nguyen, Vu Duc and Nguyen, Phu X. V. and Truong, Tham T. H. and Nguyen, Ngan Luu-Thuy},
  booktitle={2018 10th International Conference on Knowledge and Systems Engineering (KSE)},
  title={UIT-VSFC: Vietnamese Students' Feedback Corpus for Sentiment Analysis},
  year={2018},
  volume={},
  number={},
  pages={19-24},
  doi={10.1109/KSE.2018.8573337}
}
```

## 🤝 Contributing

Feel free to:

- Submit issues and feedback
- Suggest improvements
- Report bugs
- Request new features

## 📄 License

This Space uses open-source components under MIT license.

---

**Try it now!** Enter some Vietnamese text above to see the sentiment analysis in action. 🎭
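
### Appendix: Batch Flow Sketch

To illustrate the Batch Analysis flow described in this README (one text per line, at most 10 texts per batch, a probability distribution per text, then a sentiment distribution summary), here is a minimal sketch. It is not the code in `app.py`. The helper names, the label order, and the sample logits are assumptions made for illustration; in the Space the scores would come from the model rather than from hand-written values.

```python
import math
from collections import Counter

# Label names come from this README; their ORDER here is an assumption.
LABELS = ["Negative", "Neutral", "Positive"]
MAX_BATCH = 10  # batch cap stated in this README


def parse_batch(raw: str, limit: int = MAX_BATCH) -> list[str]:
    """Split one-text-per-line input, drop blank lines, keep at most `limit` texts."""
    texts = [line.strip() for line in raw.splitlines() if line.strip()]
    return texts[:limit]


def softmax(logits: list[float]) -> list[float]:
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_scores(logits: list[float]) -> tuple[str, dict[str, float]]:
    """Return the top label and the per-label probability distribution."""
    probs = dict(zip(LABELS, softmax(logits)))
    top = max(probs, key=probs.get)
    return top, probs


def summarize(predictions: list[str]) -> dict[str, int]:
    """Count how many texts fell into each sentiment class."""
    counts = Counter(predictions)
    return {label: counts.get(label, 0) for label in LABELS}


if __name__ == "__main__":
    raw_input_text = "Giảng viên dạy rất hay và tâm huyết.\n\nMôn học này quá khó và nhàm chán.\n"
    texts = parse_batch(raw_input_text)

    # Hypothetical logits standing in for real model output, one row per text.
    fake_logits = [[-1.0, 0.2, 2.5], [2.1, 0.3, -0.8]]

    predictions = []
    for text, logits in zip(texts, fake_logits):
        top, probs = label_scores(logits)
        predictions.append(top)
        print(text, "->", top, {k: round(v, 3) for k, v in probs.items()})

    print(summarize(predictions))
```

Keeping the parsing, scoring, and summary steps as separate functions makes the batch cap and the probability math easy to check without loading the model.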