---
title: Vietnamese Sentiment Analysis
emoji: 🎭
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
---
# 🎭 Vietnamese Sentiment Analysis
A Vietnamese sentiment analysis web interface built with Gradio and transformer models, optimized for Hugging Face Spaces deployment.
## 🚀 Features
- **🤖 Transformer-based Model**: Uses 5CD-AI/Vietnamese-Sentiment-visobert from Hugging Face Hub
- **🌐 Interactive Web Interface**: Real-time sentiment analysis via Gradio
- **⚡ Memory Efficient**: Built-in memory management and batch processing limits
- **📊 Visual Analysis**: Confidence scores with interactive charts
- **📝 Batch Processing**: Analyze multiple texts at once
- **🛡️ Memory Management**: Real-time memory monitoring and cleanup
## 🎯 Usage
### Single Text Analysis
1. Enter Vietnamese text in the input field
2. Click "Analyze Sentiment"
3. View the sentiment prediction with confidence scores
4. See probability distribution in the chart
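Under the hood, a single prediction can be reproduced outside the UI with the `transformers` library. The following is a minimal sketch (not the Space's exact `app.py` code) that loads the model and returns the probability distribution shown in the chart:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_ID = "5CD-AI/Vietnamese-Sentiment-visobert"
device = "cuda" if torch.cuda.is_available() else "cpu"  # automatic CUDA/CPU detection

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID).to(device).eval()

def predict_sentiment(text: str) -> dict:
    """Return a label -> probability mapping for one Vietnamese text."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512).to(device)
    with torch.no_grad():
        probs = torch.softmax(model(**inputs).logits, dim=-1)[0]
    return {model.config.id2label[i]: float(p) for i, p in enumerate(probs)}

print(predict_sentiment("Giảng viên dạy rất hay và tâm huyết."))
```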
### Batch Analysis
1. Switch to "Batch Analysis" tab
2. Enter multiple Vietnamese texts (one per line)
3. Click "Analyze All" to process all texts
4. View comprehensive batch summary with sentiment distribution
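For reference, here is a self-contained batch sketch using the `transformers` pipeline API (illustrative only; the Space's own `app.py` may be implemented differently). It mirrors the 10-text cap and the sentiment-distribution summary:

```python
from collections import Counter
from transformers import pipeline

classifier = pipeline("text-classification", model="5CD-AI/Vietnamese-Sentiment-visobert")

texts = [
    "Giảng viên dạy rất hay và tâm huyết.",
    "Môn học này quá khó và nhàm chán.",
    "Lớp học ổn định, không có gì đặc biệt.",
][:10]  # the UI caps each batch at 10 texts

results = classifier(texts, truncation=True, max_length=512)
for text, result in zip(texts, results):
    print(f"{result['label']:>8} ({result['score']:.2f})  {text}")

print("Distribution:", dict(Counter(r["label"] for r in results)))
```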
### Memory Management
- Monitor real-time memory usage
- Use "Memory Cleanup" button if needed
- Automatic cleanup after each prediction
- Maximum 10 texts per batch for efficiency
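A minimal sketch of what such monitoring and cleanup can look like with `psutil` and PyTorch (the function names here are illustrative, not the Space's actual API):

```python
import gc
import psutil
import torch

def memory_usage_mb() -> float:
    """Resident memory of the current process, in MB (via psutil)."""
    return psutil.Process().memory_info().rss / (1024 ** 2)

def cleanup_memory() -> None:
    """Free Python garbage and, if a GPU is present, the CUDA cache."""
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()

print(f"Memory in use: {memory_usage_mb():.1f} MB")
cleanup_memory()
```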
## 📊 Model Details
- **Model**: 5CD-AI/Vietnamese-Sentiment-visobert
- **Architecture**: Transformer-based (XLM-RoBERTa)
- **Language**: Vietnamese
- **Labels**: Negative, Neutral, Positive
- **Max Sequence Length**: 512 tokens
- **Device**: Automatic CUDA/CPU detection
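These details can be checked directly against the checkpoint's configuration, for example:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("5CD-AI/Vietnamese-Sentiment-visobert")
print(config.model_type)   # architecture family (XLM-RoBERTa-based)
print(config.num_labels)   # 3 sentiment classes
print(config.id2label)     # label names as stored in the checkpoint
```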
## 💡 Example Usage
Try these example Vietnamese texts:
- "Giảng viên dạy rất hay và tâm huyết." (Positive)
- "Môn học này quá khó và nhàm chán." (Negative)
- "Lớp học ổn định, không có gì đặc biệt." (Neutral)
## 🛠️ Technical Features
### Memory Optimization
- Automatic GPU cache clearing
- Garbage collection management
- Memory usage monitoring
- Batch size limits
- Real-time memory tracking
### Performance
- ~100ms processing time per text
- Supports up to 512 token sequences
- Efficient batch processing
- Memory limit: 8GB (Hugging Face Spaces)
## 📋 Model Performance
The model provides:
- **Sentiment Classification**: Positive, Neutral, Negative
- **Confidence Scores**: Probability distribution across classes
- **Real-time Processing**: Fast inference on CPU/GPU
- **Batch Analysis**: Efficient processing of multiple texts
## 🔧 Deployment
This Space is configured for Hugging Face Spaces with:
- **SDK**: Gradio 4.44.0
- **Hardware**: CPU (with CUDA support if available)
- **Memory**: 8GB limit with optimization
- **Model Loading**: Direct from Hugging Face Hub
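As a rough orientation, an `app.py` for such a Space can be as small as the sketch below (an assumption for illustration; the actual Space adds batch processing, charts, and memory management on top of this):

```python
import gradio as gr
from transformers import pipeline

classifier = pipeline("text-classification", model="5CD-AI/Vietnamese-Sentiment-visobert")

def analyze(text: str) -> dict:
    """Return the top label and its confidence for gr.Label to display."""
    result = classifier(text, truncation=True, max_length=512)[0]
    return {result["label"]: result["score"]}

demo = gr.Interface(
    fn=analyze,
    inputs=gr.Textbox(lines=3, label="Vietnamese text"),
    outputs=gr.Label(label="Sentiment"),
    title="Vietnamese Sentiment Analysis",
)

if __name__ == "__main__":
    demo.launch()
```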
## 📄 Requirements
See `requirements.txt` for complete dependency list:
- torch>=2.0.0
- transformers>=4.21.0
- gradio>=4.44.0
- pandas, numpy, scikit-learn
- psutil for memory monitoring
## 🎯 Use Cases
- **Education**: Analyze student feedback
- **Customer Service**: Analyze customer reviews
- **Social Media**: Monitor sentiment in posts
- **Research**: Vietnamese text analysis
- **Business**: Customer sentiment tracking
## 🔍 Troubleshooting
### Memory Issues
- Use "Memory Cleanup" button
- Reduce batch size
- Refresh the page if needed
### Model Loading
- Model loads automatically from Hugging Face Hub
- No local training required
- Automatic fallback to CPU if GPU unavailable
### Performance Tips
- Clear, grammatically correct Vietnamese text works best
- Longer texts (20-200 words) provide better context
- Use batch processing for multiple texts
## 📝 Citation
If you use this model or Space, please cite the underlying work:
```bibtex
@InProceedings{8573337,
  author    = {Nguyen, Kiet Van and Nguyen, Vu Duc and Nguyen, Phu X. V. and Truong, Tham T. H. and Nguyen, Ngan Luu-Thuy},
  booktitle = {2018 10th International Conference on Knowledge and Systems Engineering (KSE)},
  title     = {UIT-VSFC: Vietnamese Students' Feedback Corpus for Sentiment Analysis},
  year      = {2018},
  pages     = {19-24},
  doi       = {10.1109/KSE.2018.8573337}
}
```
## 🤝 Contributing
Feel free to:
- Submit issues and feedback
- Suggest improvements
- Report bugs
- Request new features
## 📄 License
This Space uses open-source components under the MIT license.
---
**Try it now!** Enter some Vietnamese text above to see the sentiment analysis in action. 🎭