---
title: Vietnamese Sentiment Analysis
emoji: 🎭
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
---

# 🎭 Vietnamese Sentiment Analysis

A Vietnamese sentiment analysis web interface built with Gradio and transformer models, optimized for Hugging Face Spaces deployment.

## 🚀 Features

- **🤖 Transformer-based Model**: Uses 5CD-AI/Vietnamese-Sentiment-visobert from Hugging Face Hub
- **🌐 Interactive Web Interface**: Real-time sentiment analysis via Gradio
- **⚡ Memory Efficient**: Built-in memory management and batch processing limits
- **📊 Visual Analysis**: Confidence scores with interactive charts
- **📝 Batch Processing**: Analyze multiple texts at once
- **🛡️ Memory Management**: Real-time memory monitoring and cleanup

## 🎯 Usage

### Single Text Analysis
1. Enter Vietnamese text in the input field
2. Click "Analyze Sentiment"
3. View the sentiment prediction with confidence scores
4. See probability distribution in the chart
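
The confidence scores in step 3 and the chart in step 4 both come from a probability distribution over the three labels. As a rough sketch of how raw model scores could be turned into that distribution, the snippet below applies a softmax. The `LABELS` order and the `describe` helper are assumptions for illustration; the actual label mapping is defined by the model's configuration.

```python
import math

# Hypothetical label order for illustration only; the real mapping comes
# from the model's config (id2label) and may differ.
LABELS = ["Negative", "Neutral", "Positive"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def describe(logits):
    """Return (top label, its confidence, full distribution) for one text."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best], dict(zip(LABELS, probs))
```

The distribution returned by `describe` is what a bar chart of class probabilities would plot.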

### Batch Analysis
1. Switch to the "Batch Analysis" tab
2. Enter multiple Vietnamese texts (one per line)
3. Click "Analyze All" to process all texts
4. View comprehensive batch summary with sentiment distribution
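
The batch steps above can be sketched in two small helpers: one splits the one-per-line input into texts, the other turns predicted labels into a sentiment distribution. Both function names are hypothetical, and the app's own summary may contain more fields than this.

```python
from collections import Counter


def split_lines(raw):
    """Split textarea input into non-empty, stripped texts (one per line)."""
    return [line.strip() for line in raw.splitlines() if line.strip()]


def summarize(labels):
    """Count each predicted label and its share of the batch."""
    total = len(labels)
    if total == 0:
        return {}
    counts = Counter(labels)
    return {
        label: {"count": n, "share": n / total}
        for label, n in counts.items()
    }
```

Blank lines are dropped before counting, so stray newlines in the input do not skew the distribution.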

### Memory Management
- Monitor real-time memory usage
- Use the "Memory Cleanup" button if needed
- Automatic cleanup after each prediction
- Maximum 10 texts per batch for efficiency
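
A minimal sketch of the last two points, assuming the app rejects oversized batches (it might instead truncate them; the README does not say) and runs garbage collection after every prediction. `check_batch` and `predict_then_cleanup` are illustrative names, not functions from `app.py`.

```python
import gc

MAX_BATCH_SIZE = 10  # the per-batch limit stated above


def check_batch(texts, limit=MAX_BATCH_SIZE):
    """Reject batches larger than the limit (assumed behaviour)."""
    if len(texts) > limit:
        raise ValueError(
            f"Got {len(texts)} texts; the limit is {limit} per batch."
        )
    return texts


def predict_then_cleanup(predict, text):
    """Call any predict function, then collect garbage even if it raised."""
    try:
        return predict(text)
    finally:
        gc.collect()
```

Using `try`/`finally` means cleanup still runs when a prediction fails, which matters most when memory is already tight.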

## 📊 Model Details

- **Model**: 5CD-AI/Vietnamese-Sentiment-visobert
- **Architecture**: Transformer-based (XLM-RoBERTa)
- **Language**: Vietnamese
- **Labels**: Negative, Neutral, Positive
- **Max Sequence Length**: 512 tokens
- **Device**: Automatic CUDA/CPU detection
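
For readers who want to try the model outside the Space, the sketch below shows one way to load it with the `transformers` pipeline API, pick a device automatically, and truncate inputs to the 512-token limit. It assumes the model loads through the standard `text-classification` pipeline and that the installed `transformers` version supports `top_k=None`; it is not taken from `app.py`.

```python
MODEL_ID = "5CD-AI/Vietnamese-Sentiment-visobert"
MAX_TOKENS = 512  # max sequence length listed above


def pick_device(cuda_available):
    """Pipeline device index: 0 for the first GPU, -1 for CPU."""
    return 0 if cuda_available else -1


def load_classifier():
    """Download the model from the Hub and build a pipeline (needs network)."""
    # Imported lazily so the helpers above work without torch installed.
    import torch
    from transformers import pipeline

    return pipeline(
        "text-classification",
        model=MODEL_ID,
        device=pick_device(torch.cuda.is_available()),
    )


def classify(clf, text):
    """Score every label for one text, truncating long inputs."""
    # top_k=None requests scores for all labels instead of only the best one.
    return clf(text, top_k=None, truncation=True, max_length=MAX_TOKENS)
```

A typical call would be `classify(load_classifier(), "Giảng viên dạy rất hay và tâm huyết.")`; the label names in the result come from the model's own configuration.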

## 💡 Example Usage

Try these example Vietnamese texts:

- "Giảng viên dạy rất hay và tâm huyết." (Positive)
- "Môn học này quá khó và nhàm chán." (Negative)
- "Lớp học ổn định, không có gì đặc biệt." (Neutral)

## 🛠️ Technical Features

### Memory Optimization
- Automatic GPU cache clearing
- Garbage collection management
- Memory usage monitoring
- Batch size limits
- Real-time memory tracking
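
The cleanup and monitoring items above could look roughly like this. It is a sketch under the assumption that cleanup means `gc.collect()` plus `torch.cuda.empty_cache()` and that monitoring reads the process's resident memory through `psutil`; the helper names are hypothetical.

```python
import gc


def cleanup():
    """Collect garbage and, if CUDA is in use, release cached GPU memory."""
    collected = gc.collect()
    try:
        import torch
    except ImportError:
        return collected  # no PyTorch installed, nothing else to clear
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    return collected


def memory_mb():
    """Resident memory of this process in MB, or None without psutil."""
    try:
        import psutil
    except ImportError:
        return None
    return psutil.Process().memory_info().rss / (1024 * 1024)
```

Comparing `memory_mb()` before and after `cleanup()` gives a quick check of how much memory a cleanup pass actually frees.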

### Performance
- Roughly 100 ms processing time per text (varies with hardware and text length)
- Supports up to 512 token sequences
- Efficient batch processing
- Memory limit: 8GB (Hugging Face Spaces)

## 📋 Model Performance

The model provides:
- **Sentiment Classification**: Positive, Neutral, Negative
- **Confidence Scores**: Probability distribution across classes
- **Real-time Processing**: Fast inference on CPU/GPU
- **Batch Analysis**: Efficient processing of multiple texts

## 🔧 Deployment

This Space is configured for Hugging Face Spaces with:
- **SDK**: Gradio 4.44.0
- **Hardware**: CPU by default; CUDA is used automatically when a GPU is available
- **Memory**: 8GB limit with optimization
- **Model Loading**: Direct from Hugging Face Hub

## 📄 Requirements

See `requirements.txt` for the complete dependency list. The main packages are:
- torch>=2.0.0
- transformers>=4.21.0
- gradio>=4.44.0
- pandas, numpy, scikit-learn
- psutil for memory monitoring

## 🎯 Use Cases

- **Education**: Analyze student feedback
- **Customer Service**: Analyze customer reviews
- **Social Media**: Monitor sentiment in posts
- **Research**: Vietnamese text analysis
- **Business**: Customer sentiment tracking

## 🔍 Troubleshooting

### Memory Issues
- Use the "Memory Cleanup" button
- Reduce batch size
- Refresh the page if needed

### Model Loading
- Model loads automatically from Hugging Face Hub
- No local training required
- Automatic fallback to CPU if GPU unavailable

### Performance Tips
- Clear, grammatically correct Vietnamese text works best
- Longer texts (20-200 words) provide better context
- Use batch processing for multiple texts

## 📝 Citation

If you use this model or Space, please cite the UIT-VSFC corpus paper:

```bibtex
@InProceedings{8573337,
  author={Nguyen, Kiet Van and Nguyen, Vu Duc and Nguyen, Phu X. V. and Truong, Tham T. H. and Nguyen, Ngan Luu-Thuy},
  booktitle={2018 10th International Conference on Knowledge and Systems Engineering (KSE)},
  title={UIT-VSFC: Vietnamese Students' Feedback Corpus for Sentiment Analysis},
  year={2018},
  volume={},
  number={},
  pages={19-24},
  doi={10.1109/KSE.2018.8573337}
}
```

## 🤝 Contributing

Feel free to:
- Submit issues and feedback
- Suggest improvements
- Report bugs
- Request new features

## 📄 License

This Space uses open-source components under the MIT license.

---

**Try it now!** Enter some Vietnamese text in the app to see the sentiment analysis in action. 🎭