---
license: apache-2.0
datasets:
- dair-ai/emotion
language:
- en
metrics:
- accuracy
- f1
- precision
- recall
pipeline_tag: text-classification
---



# Emotion Classification with BERT + RL Fine-tuning

This model combines a BERT classifier with Reinforcement Learning (RL) for emotion classification. It was first fine-tuned with supervised learning on the `dair-ai/emotion` dataset (20k English sentences labeled with 6 emotions), then further optimized with PPO-based reinforcement learning.

## 🔧 Training Approach

1. **Supervised Phase**:  
   - Base BERT model fine-tuned with cross-entropy loss  
   - Reached a baseline accuracy of 0.9205 (see the Pre-RL column in the table below)  

2. **RL Phase**:  
   - Implemented Actor-Critic architecture  
   - Policy Gradient optimization with custom rewards  
   - PPO clipping (ε=0.2) and entropy regularization  
   - Custom reward function: `+1.0` for correct, `-0.1` for incorrect predictions
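
The RL phase above can be sketched roughly as follows. This is a minimal, framework-free illustration and not the repository's actual training code: the function names, the `ENTROPY_COEF` value, and the use of `reward - value` as the advantage are assumptions made for the example. Only the reward values (`+1.0` / `-0.1`) and the clipping range (ε=0.2) come from the list above.

```python
import math

CLIP_EPS = 0.2       # PPO clipping range stated above
ENTROPY_COEF = 0.01  # assumed value; the model card does not state it


def reward(predicted: int, target: int) -> float:
    """Reward from the card: +1.0 for a correct label, -0.1 otherwise."""
    return 1.0 if predicted == target else -0.1


def ppo_policy_loss(new_logps, old_logps, advantages, entropies):
    """Mean clipped-surrogate loss with an entropy bonus (hypothetical helper).

    new_logps / old_logps: log-probabilities of the chosen labels under the
    current and the pre-update policy. advantages: e.g. reward minus the
    critic's value estimate (an assumption here).
    """
    losses = []
    for new_lp, old_lp, adv, ent in zip(new_logps, old_logps, advantages, entropies):
        ratio = math.exp(new_lp - old_lp)
        clipped = max(1.0 - CLIP_EPS, min(1.0 + CLIP_EPS, ratio))
        # Pessimistic (minimum) surrogate; negated because we minimize the loss.
        surrogate = min(ratio * adv, clipped * adv)
        losses.append(-surrogate - ENTROPY_COEF * ent)
    return sum(losses) / len(losses)
```

With a positive advantage, the clip caps how much a single update can increase the probability of a rewarded label; with a negative advantage, the unclipped term dominates, so bad moves are still fully penalized.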

## 📊 Performance Comparison

| Metric     | Pre-RL  | Post-RL | Relative Δ |
|------------|---------|---------|------------|
| Accuracy   | 0.9205  | 0.931   | +1.14%  |
| F1-Score   | 0.9227  | 0.9298  | +0.77%  |
| Precision  | 0.9325  | 0.9305  | -0.21%  |
| Recall     | 0.9205  | 0.931   | +1.14%  |

Key observation: RL fine-tuning gave modest gains in accuracy, F1, and recall (roughly +1% relative), at the cost of a slight drop in precision (-0.21%).
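
For reference, the percentage changes in the table are relative to the pre-RL value, not absolute point differences. A small sketch that reproduces them from the reported scores (the `relative_change_pct` helper is just an illustrative name):

```python
# Pre-RL and post-RL scores copied from the table above.
metrics = {
    "Accuracy": (0.9205, 0.931),
    "F1-Score": (0.9227, 0.9298),
    "Precision": (0.9325, 0.9305),
    "Recall": (0.9205, 0.931),
}


def relative_change_pct(pre: float, post: float) -> float:
    """Change from pre to post, as a percentage of the pre value."""
    return (post - pre) / pre * 100


for name, (pre, post) in metrics.items():
    print(f"{name}: {relative_change_pct(pre, post):+.2f}%")
```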

## 🚀 Usage

```python
from transformers import pipeline

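# Load the fine-tuned model from the Hugging Face Hub,
# paired with the standard uncased BERT tokenizer.
classifier = pipeline(
    "text-classification",
    model="SimoGiuffrida/SentimentRL",
    tokenizer="bert-base-uncased",
)

# Returns a list with one dict per input, each holding a "label" and a "score".
results = classifier("I'm thrilled about this new opportunity!")
print(results)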
```
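
If the pipeline returns generic labels such as `LABEL_1` rather than emotion names (this depends on whether the model config defines `id2label`, which is not stated here), they can be mapped back using the class order of `dair-ai/emotion`. A minimal sketch; `readable_label` is a hypothetical helper, not part of `transformers`:

```python
# Class order of the dair-ai/emotion dataset (index -> emotion name).
EMOTION_LABELS = ["sadness", "joy", "love", "anger", "fear", "surprise"]


def readable_label(raw_label: str) -> str:
    """Map 'LABEL_<i>' to an emotion name; pass other labels through unchanged."""
    if raw_label.startswith("LABEL_"):
        index = int(raw_label.split("_", 1)[1])
        return EMOTION_LABELS[index]
    return raw_label
```

For example, `readable_label(results[0]["label"])` could be applied to the output of the snippet above.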

## 💡 Key Features
- Hybrid training: Supervised + Reinforcement Learning
- Optimized for nuanced emotion detection
- Handles class imbalance (see confusion matrix in repo)

For full training details and analysis, visit the [GitHub repository](https://github.com/SimoGiuffrida/DLA2).